Expression systems

ABSTRACT

The invention relates to new expression systems and in particular to an expression system in which a gene of interest is expressed at an optimal level. The invention provides a recombinant expression vector comprising a gene of interest and a selectable marker gene, wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that translation reinitiation is required before said selectable marker protein is expressed from the corresponding mRNA. Examples of such expression systems are vector viral packaging cell lines and a number of preferred cell lines have been identified.

The present invention relates to new expression systems, and in particular to expression systems in which a gene of interest is expressed at an optimal level. Particular examples of such expression systems are retroviral packaging cell lines and a number of preferred cell lines have been identified.

The ability of eukaryotic and prokaryotic ribosomes to reinitiate translation at an internal start codon within an mRNA sequence has previously been recognised. Studies have been reported in which the efficiency of the process, which is generally regarded as being low, has been connected with the length of the intercistronic sequence (Kozak (1987) Mol. Cell Biol. 7, 3438-3445). Selection of this sequence or spacer as 70 bp in length, and containing no other start codons, has been previously reported as being optimal for reinitiation in a eukaryotic cell line (Cosset F-L., Virology (1991) 185, 862).

The applicants have found a way in which the inefficiency associated with the translation reinitiation process can be used to good effect.

According to the present invention there is provided a recombinant expression vector comprising a gene of interest and a selectable marker gene, wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that translation re-initiation is required before said selectable marker protein is expressed from the corresponding mRNA.

The invention further provides a process for producing cell lines in which a gene of interest is expressed, which process comprises transforming host cells with an expression vector comprising said gene of interest and a selectable marker gene, wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that translation re-initiation is required before said selectable marker protein is expressed from the corresponding mRNA, and selecting those cells where expression of the selectable marker gene may be detected.

Since re-initiation of translation is a relatively inefficient process, this means that the selectable marker protein will be expressed at lower levels than the product of the gene of interest. When the marker protein is expressed at detectable levels, the gene of interest will be expressed at higher levels. This will ensure that during the subsequent selection procedure, only those cell clones which express the gene of interest at higher or optimal levels will survive. Low expressing clones will be eliminated by the selection process.
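The selection logic described above can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that marker expression equals the expression of the gene of interest multiplied by a fixed re-initiation efficiency, and that a clone survives only if its marker expression reaches a threshold. The efficiency, the threshold and the expression levels are hypothetical values, not data from the experiments described herein.

```python
def surviving_clones(goi_levels, reinit_efficiency, marker_threshold):
    """Return the clones whose marker expression reaches the selection threshold.

    Toy model (assumption): marker expression = GOI expression * reinit_efficiency.
    """
    survivors = []
    for level in goi_levels:
        marker = level * reinit_efficiency
        if marker >= marker_threshold:
            survivors.append(level)
    return survivors


# Hypothetical expression levels of the gene of interest (arbitrary units).
clones = [5, 20, 80, 200, 1000]
# Hypothetical 10% re-initiation efficiency and marker threshold of 10 units.
kept = surviving_clones(clones, reinit_efficiency=0.1, marker_threshold=10)
print(kept)
```

Under these assumed numbers only clones expressing the gene of interest at 100 units or more can reach the marker threshold, which is the sense in which low-expressing clones are eliminated.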

Cells transformed with the above-described expression vectors form a further aspect of the invention.

The host cells are suitably eukaryotic or prokaryotic host cells, preferably eukaryotic host cells.

The number of nucleotides in the space between the stop codon of the gene of interest and the start codon of the selectable marker will suitably be in the range of from 20-200 nucleotides, preferably from 60-80 nucleotides, even more preferably 70-80 nucleotides.
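As a minimal sketch, the ranges stated above can be expressed as a simple classification of a spacer length. The function name is hypothetical, and treating the range boundaries as inclusive is an assumption.

```python
def classify_spacer_length(n_nucleotides):
    """Classify a stop-to-start spacer length against the ranges given in the text.

    Boundaries are treated as inclusive (assumption).
    """
    if 70 <= n_nucleotides <= 80:
        return "most preferred (70-80 nt)"
    if 60 <= n_nucleotides <= 80:
        return "preferred (60-80 nt)"
    if 20 <= n_nucleotides <= 200:
        return "suitable (20-200 nt)"
    return "outside the stated range"


for length in (74, 76, 65, 150, 10):
    print(length, classify_spacer_length(length))
```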

The vectors used in the process of the invention may be any of the known types, for example expression plasmids or viral vectors.

Selected cells may be cultured and if required, the protein product of the gene of interest isolated from the culture using conventional techniques. Alternatively, expression of the gene of interest may result in other desired effects, for example, where the gene of interest is included as part of a viral packaging construct.

Some experimental and clinical gene transfer protocols require the design of gene transfer vectors suitable for in vivo gene delivery (Miller, A.D. 1992. Nature 357: 455-460). Retroviral vectors are attractive candidates for such applications, because they can provide stable gene transfer and expression (Samarut J. et al., Meth. Enzymol. in press) and because packaging cells have been designed which produce non-replication competent viruses (Miller A. D. (1990) Hum Gene Ther. 1: 5-14). However, currently available recombinant retroviruses suffer from a number of drawbacks.

Packaging cell lines provide in trans the retroviral proteins encoded by the gag, pol, and env genes required to obtain infectious retroviral particles. The gag and pol products are respectively the structural components of the virion cores and the replication machinery (enzymes) of the retroviral particles whereas the env products are envelope proteins responsible for the host-range of the virions and for the initiation of infection and for sensitivity to humoral factors. An ideal packaging cell line should produce retroviruses that only contain the retroviral vector genome, and absolutely no replication-competent genomes or defective genomes encoding some of the viral structural genes.

A number of packaging cell lines for human gene transfer have been designed in the past by introducing plasmid DNAs which contain "helper genomes" encoding gag, pol and/or env genes into cells.

Retroviral packaging cell lines are cells that have been engineered to provide in trans all the functions required to express infectious retroviral vectors. A helper genome (or construct or unit) is herein also referred to as "retroviral packaging construct (or unit)" or "packaging-deficient construct (or genome unit)" or "gag-pol/env expression plasmids".

Much effort has been made to design strategies to optimize the helper genomes in order (i) to get the highest production of retroviral packaging functions (which correlates with infection titers of retroviral particles) and (ii) to minimise the chance that the helper genome can be transmitted via the viral particles (which may lead to emergence of unwanted retroviral forms).

The first of these packaging cell lines used full length retroviral genomes as helper genomes that had been crippled for important cis-regulated replicative functions (reviewed in Miller, Hum. Gene. Ther. 1: 5-14 1990). In order to reduce the possibility of occurrence of replication-competent viruses and of transfer of virus structural genes, a second generation of safer packaging cell lines has been designed by using two separate and complementary helper genomes which express either gag-pol or env and are packaging-deficient (Miller supra).

The cells into which these helper genomes were introduced were isolated by cotransfecting them with plasmids encoding selectable markers. However, as no selection was applied on the packaging-deficient retroviral genome itself, the helper functions can be lost during the passages of the cells in culture and the current packaging systems provide limited titers of infectious retroviral vectors, usually only of the order of 10⁵-10⁶ infectious units (i.u.)/ml. Indeed, cotransfection with a plasmid encoding a selectable marker does not directly select the best gag-pol-env-expressing cells.

The invention further provides a retroviral packaging cell line comprising a host cell transformed with (i) a packaging deficient construct which expresses a viral gag-pol gene and a first selectable marker gene, and/or (ii) a packaging-deficient construct which expresses a viral env gene and a second selectable marker gene; wherein a start codon of the first and second selectable markers are spaced from the stop codons of the viral gag-pol gene and the viral env gene respectively by a distance which ensures that reinitiation of mRNA translation is required for expression of marker protein product of said first and/or second selectable marker gene.

The retroviral packaging cell line may be obtained by the above described process which will involve selecting transfected cells which express said first and/or second marker genes.

By using helper constructs which are directly selectable and which provide for high expression of the viral gene, high titre retroviral vectors may be obtained.

Helper constructs for use in the process form a further aspect of the invention.

The retroviral vectors prepared from the conventional packaging cell lines are usually not contaminated by replication-competent retroviruses (RCRs). However, recombinant amphotropic murine retroviruses have been shown to arise spontaneously from certain packaging cell lines. The generation of such RCRs involves recombination at least between gag-pol/env packaging sequence and vector sequences (Cosset et al., Virology, (1993) 193: 385-395).

Recombinant RCRs have been associated with the development of lymphomas in some severely immunosuppressed monkeys (Donahue et al., J. Exp Med (1992) 176: 1125-1135). In addition, retroviral vector preparations may also contain, at low frequencies, retroviruses coding for functional envelope glycoproteins (Kozak and Kabat, 1990, J. Virol. 64: 3500-3508) or for gag-pol proteins. Although the pathogenicity of these gag-pol or env recombinant retroviruses is probably low, more evolved recombinant retroviruses with higher pathogenic potential may occur when injected in vivo, by recombination and/or complementation of the initial recombinant viruses with some endogenous retroviruses.

In a preferred embodiment of the retroviral packaging cell lines of the invention, the overlapping sequences between the genomes of the retroviral vector and the helper construct are reduced, for example as compared to constructs such as CRIPenv and CRIPAMgag (Danos et al., Proc. Natl. Acad. Sci USA 85: 6460-6464). In particular, the viral sequences in the helper construct are reduced, for example, not only the packaging sequence but also the 3' Long Terminal Repeat (LTR), the 3' non-coding sequence and/or the 5' LTR may be eliminated.

The possibility of generation of such RCRs and recombinant retroviruses can be reduced by reducing the overlapping sequences between the genomes of both the retroviral vector and the helper construct.

Conventional retroviral vectors are strongly inactivated by human serum which makes them of limited or no use for in situ gene transfer in gene therapy applications. It has previously been shown that inactivation by complement in human serum is controlled by the cell line used to produce the virions and by viral envelope determinants (Takeuchi et al., J. Virol (1994) 68: 8001-8007). In particular, inactivation is caused by some properties of the cell lines that have been used to construct the packaging cells (NIH-3T3) and also by viral determinants located in the retroviral envelope as shown (Takeuchi et al., J. Virol (1994) 68: 8001-8007). In vivo gene delivery is an important goal for a number of human gene therapy strategies.

The applicants have found that certain cell lines form preferred packaging cell lines.

Particularly preferred packaging cell lines are the HT1080 line, the TE671 line, the 3T3 line, the 293 line and the Mv-1-Lu line. One example of retroviral packaging cells that will produce complement-resistant virus comprises human HT1080 cells expressing the RD114 envelope. Such cells form a preferred aspect of the invention.

Packaging cell lines according to the invention provide 50-100 fold increased titers of retroviral vectors as compared to conventional packaging cell lines. Retroviral vectors provided by these new cells are safe, in terms of generation of RCRs, and considerably more resistant to inactivation by human complement.

Packaging cell lines according to the invention may be able to transduce helper-free, human complement-resistant retroviral vectors at titers consistently higher than 10⁷ i.u./ml.

Suitable semi-packaging cell lines in accordance with the invention are those which express only the gag-pol genes. Such cell lines may suitably be derived from TE671, MINK Mv-1-Lu, HT1080, 293 or NIH-3T3 cells by introduction of plasmid CeB (the MoMLV gag-pol expression unit).

Particularly preferred expression vectors in accordance with the invention for use in retroviral packaging cell lines are those which include MLV gag and pol genes such as CeB. Other plasmids may include gag and pol genes from other retroviruses or chimeric or mutated gag and pol genes.

Various viral and retroviral envelope genes may be included in the plasmids such as MLV-A envelope, GALV envelope, VSV-G protein, BaEV envelope, RD114 envelope and chimeric or mutated envelopes. Plasmids which include the RD114 env gene such as FBdelPRDSAF as illustrated hereinafter, provide one example of suitable constructs.

The novel retroviral packaging cells described hereinafter, have been designated FLY cells, and may be designed for in vivo gene delivery.

Considerable variations were found between the various cell lines screened for their ability to release type C mammalian retroviruses. In addition, few cell lines were able to produce retroviruses completely resistant to human complement. Based on these two criteria, human fibrosarcoma HT1080 and rhabdomyosarcoma TE671 cells were selected for optimum construction of packaging cells. Other studies have shown the importance of endogenous retrovirus expression in the generation of recombinant retroviruses from retroviral packaging lines (Ronfort et al., Virology, (1995), 207, 271-275, Vanin, E. F. et al., J Virol (1994) 68: 4241-4250.). The co-packaging of an endogenous genome and a vector can lead to emergence of recombinant retroviruses (Vanin et al., supra). Recombination involves template switching during reverse transcription of such hybrid retroviruses (Hu et al., Science, (1990) 250: 1227) and homologies between the two genomes considerably enhance the frequency of reverse transcriptase jumps (Zhang et al., J. Virol. (1994) 68: 2409-2414). Therefore an ideal packaging cell line should not express endogenous MLV-like (or type C retrovirus-like) retroviral genomes which can be packaged by type C gag proteins (Scadden et al., J. Virol. (1990) 64: 424-427, Torrent et al., J. Mol. Biol. (1994) 240 434-444).

Packaging of human endogenous retroviral RNA was not detected in TELCeB and FLY packaging cells when virion-associated RNA was analysed by RT-PCR using generic primers. HT1080- and TE671-derived packaging cell lines may be safer in this respect than those generated from NIH3T3 cells, such as GP+EAM12 cells, which are known to express and package sequences related to type C retroviruses (Scadden et al. supra).

To generate the FLY packaging cell lines, HT1080 cells were transfected with gag-pol and env expression plasmids designed to optimise viral protein expression. Direct selection for viral gene expression was achieved in accordance with the invention by expression of a selectable marker gene by re-initiation of translation of the mRNA expressing the viral proteins. This strategy resulted in packaging cell lines capable of producing extremely high titer viruses. Furthermore, long-term expression of packaging functions can be maintained in these cells. Many unnecessary viral sequences were eliminated from the packaging constructs to reduce the risk of helper virus generation; indeed the final packaging cells did not produce helper virus, in that no replication competent virus (RCR) could be detected per 10⁷ vector particles.

The FLY packaging cells described herein are safer than, for example, psiCRIP cells, at least for generation of env recombinant retroviruses as is illustrated in Table 4 hereinafter, probably because fewer retroviral sequences overlapping with the vector were present in the present env-expression plasmid. Few reports have addressed the question of the characterization of recombinant retroviruses (RVs) (Cosset, F. L., et al., Virology (1993) 193: 385-395). It is possible that such RVs could not be detected in previous packaging cell lines due to lower overall titers. RVs are defective in normal cell culture conditions but are likely to evolve to replication competent viruses if they are allowed to replicate in cells complementing their expression like co-cultivated packaging cells (Bestwick et al., Proc. Natl Acad Sci USA, (1988) 85: 5404-5408, Cosset et al., (1993) supra).

In preferred retroviral packaging systems according to the invention, RVs are eradicated for example by removal of viral LTRs from the packaging construct.

Consistent with our previous studies (Takeuchi, Y., et al., J Virol (1994) 68: 8001-8007), LacZ(RD114) and LacZ(MLV-A) pseudotypes produced from HT1080 and TE671 cells were more resistant to human complement than LacZ(RD114) or LacZ(MLV-A) pseudotypes produced by 3T3 or dog cells. It was therefore decided to use RD114 and MLV-A env genes to generate recombinant virions with MoMLV cores.

The sequence of RD114 env gene was determined and is shown in FIG. 4 (SEQ ID NO: 1). It was found to be very close to BaEV (baboon endogenous virus) a type C retrovirus (Benveniste, R. E. et al., Proc. Natl. Acad. Sci. USA (1973) 70: 3316-3320; Kato, S. et al., Japan. J. Genet. (1987) 62: 127-137) with an envelope gene displaying similarities to the external part of type D simian retroviruses (SRVs). RD114 uses the SRV receptor on human cells (Sommerfelt & Weiss, Virology (1990) 176: 58-69; Sommerfelt, M. A. et al., J Virol (1990) 64: 6214-6220) making the FLY packaging cells with RD114 envelope capable of generating virions with different tropism. Retroviral vectors prepared so far for human gene therapy have used either MLV-A or GALV (gibbon ape leukemia virus) envelopes which display some similarities (Battini, J. L.,et al., J Virol. (1992) 66: 1468-1475) and which use two related cell surface receptors for infection (Miller, D. G. et al., J Virol (1994) 68: 8270-8276). Differences in tissue-specific expression of MLV-A or GALV receptors have been reported (Kavanaugh et al., Proc Natl Acad Sci USA (1994) 91: 7071-7075).

The invention will now be particularly described by way of example with reference to the accompanying drawings in which:

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the structure and expression of CeB. The env gene (XbaI-ClaI) of plasmid pCRIP was removed and was replaced by coinsertion of the two fragments XbaI-SfiI (restriction sites underlined) from pOXEnv and a SfiI-ClaI PCR product containing the bsr selectable marker. This results in positioning the bsr start codon (shadowed) 74 bp downstream of the pol stop codon (bold). The sequence shown in the figure corresponds to SEQ ID NO: 28.

Open triangles are start codons (gag and bsr), black triangles are stop codons (pol and bsr). The shadowed triangle is the start codon of env, in the same reading frame as that of bsr. SD and SA are the splice donor and splice acceptor sites.

FIG. 2 illustrates the structure and expression of FbdelPASAF.

Immediately after the stop codon of env (bold) was inserted a non retroviral KasI-NcoI (restriction sites underlined) linker which positions the phleo start codon (shadowed) 76 bp downstream. Open triangles are start codons (env and phleo), black triangles are stop codons (env and phleo). SD and SA are the splice donor and splice acceptor sites. The sequence shown in the figure corresponds to SEQ ID NO: 29.

FIG. 3 illustrates plasmids for expression of Ampho, Eco, RD114, Xeno, 10A1, GALV, VSV-G and FeLVB envelopes. All genes are expressed in the same backbone as detailed in FIG. 2. The BglII sites for ecotropic (MoMLV strain), 10A1, xenotropic (NZB.1.V6 strain) and amphotropic (4070A strain), the NdeI site of RD114 (SC3C strain) and the BamHI site for both FeLVB and GALV were used as 5' ends, and linked to the MscI site immediately after the splice donor site in the leader of FB29 LTR.

FIG. 4 shows the sequence of the RD114 env gene (SEQ ID No 1).

FIG. 5 shows the genetic structure of gag-pol constructs. Initiation (∇) and termination (▾) codons are shown. The thick dotted line below each construct shows MLV-derived sequences. Nucleotide positions of MLV-derived sequences are shown according to Shinnick et al. (1981) (from nt 1 to nt 6000 with deletion of the packaging signal (ΔΨ) from BalI (nt 215) to PstI (nt 568), and with some further MOMLV sequences in both CeB and CeB DS- from nt 7676 to nt 7938). gag-pol and bsr genes were expressed from the same transcription unit using either a retroviral promoter (Mo LTR) or a non retroviral promoter (hCMV) and a non retroviral polyadenylation sequence (polyA). Splice donor (SD) and acceptor (SA) sites are indicated. The thin line denotes retroviral non coding sequences. The thick line shows the rabbit beta-1 globin intron B. The position of some restriction sites is indicated.

FIG. 6 shows the nucleic acid sequence of a portion of CeB (SEQ ID NO:2).

FIG. 7 shows the nucleic acid sequence of a portion of hCMV+intron (SEQ ID NO:3).

FIG. 8 shows the nucleic acid sequence of a portion of hCMV+intronka (SEQ ID NO:4).

FIG. 9 shows the nucleic acid sequence of a portion of FbdelPASAF (SEQ ID NO:5).

FIG. 10 shows the nucleic acid sequence of a portion of FbdelPMOSAF (SEQ ID NO:6).

FIG. 11 shows the nucleic acid sequence of a portion of FbdelPGASAF (SEQ ID NO:7).

FIG. 12 shows the nucleic acid sequence of a portion of FbdelPRDSAF (SEQ ID NO:8).

FIG. 13 shows the nucleic acid sequence of a portion of CMV10A1 (SEQ ID NO:9).

The components of the viral particles are produced by two independent expression plasmids (gag-pol or env) which also contain selectable markers (bsr or phleo) expressed from the same transcriptional units as gag-pol or env (FIGS. 1 & 2). The selectable markers are located downstream of the gag-pol or env genes and there is an optimal distance between the stop codon of the upstream reading frames and the start codon of the selectable genes that should allow re-initiation of translation (Kozak, Mol Cell Biol. (1987) 7: 3438-3445). Because the marker genes lack the "Kozak" sequence (Kozak, Cell (1986) 44: 283-292) required for normal initiation of translation, they can only be expressed by re-initiation of translation after the upstream viral gene has been successfully expressed. Consequently, and also because re-initiation of translation is a poorly efficient process, after transfection of these plasmids, cells resistant to the drugs corresponding to those selectable genes express high levels of the viral proteins.

To avoid viral transmission of these "helper" genomes the constructs used suitably have the classical deletions of both the packaging sequence located in the leader region and of the 3' LTR, the latter being replaced by SV40 polyadenylation sequences (FIGS. 1 & 2).

Plasmid CeB is the MoMLV gag-pol-expression unit. It derives from pCRIP, a plasmid used to generate the constructs introduced in the CRIP and CRE packaging cell lines (Danos and Mulligan, 1988). As shown in FIG. 1, for generation of plasmid CeB the env gene of pCRIP has been mostly deleted and the bsr selectable marker, encoding a protein conferring resistance to blasticidin (Izumi et al., Experimental Cell Research (1991) 197, 229-233), has been inserted downstream of the pol gene. There are exactly 74 bp with no ATG triplets between the stop codon of pol and the start codon of bsr; this allows its expression by re-initiation of translation on the gag-pol mRNA, after translation of the gag-pol reading frame.
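The two properties stated for the CeB spacer (its length and the absence of ATG triplets) can be checked on any candidate sequence with a few lines of code. The following is a minimal sketch; the function names are hypothetical and the toy transcript is an invented placeholder, not the CeB sequence.

```python
def spacer_between(mrna, upstream_stop_end, downstream_start):
    """Return the nucleotides between an upstream stop codon and a downstream start codon.

    upstream_stop_end: 0-based index of the first base after the stop codon.
    downstream_start: 0-based index of the 'A' of the downstream ATG.
    """
    return mrna[upstream_stop_end:downstream_start]


def has_internal_atg(spacer):
    """True if the spacer contains an ATG triplet in any reading frame."""
    return "ATG" in spacer.upper()


# Invented toy transcript: upstream ORF, 74-nt ATG-free spacer, downstream ORF.
upstream_orf = "ATGGCCTAA"
toy_spacer = "C" * 74
downstream_orf = "ATGAAATGA"
mrna = upstream_orf + toy_spacer + downstream_orf

spacer = spacer_between(mrna, len(upstream_orf), len(upstream_orf) + len(toy_spacer))
print(len(spacer), has_internal_atg(spacer))
```

This check looks only inside the spacer itself; an ATG formed across the junction with the upstream stop codon would need the flanking bases to be included in the scanned window.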

FbdelPASAF is a plasmid expressing the amphotropic env gene and the phleo selectable marker conferring resistance to phleomycin (Gatignol et al., FEBS Letters (1988) 230: 171-175). By using a PCR-mediated mutagenesis strategy which modifies the end of the env gene (see FIG. 2), a 76 bp linker was inserted between the stop codon of env and the start codon of phleo. This allows expression of phleo from the env mRNA by re-initiation of translation. In addition, compared to known env-expressing constructs, this strategy of construction has reduced the length of sequences overlapping with the ends of conventional retroviral vectors. The env genes of Mo-MLV, FeLVB, NZB.1V6, 10A1, GALV and RD114 are expressed by plasmids FBdelPMoSAF, FBdelPBSAF, FBdelXSAF, FBdelpGSAF, FBdelp10A1SALF and FBdelPRDSAF, respectively, by using the same backbone as FBdelPASAF (FIG. 3). Retroviral vectors produced with the RD114 envelope will be useful for in vivo gene delivery because, in contrast to virions bearing MLV ecotropic or amphotropic envelopes, virions pseudotyped with RD114 envelopes are not inactivated by human complement when they are produced by Mink Mv-1-Lu cells or by some human cells (Table 1).

The HT1080 cell line, isolated from a human fibrosarcoma (ATCC CCL121), and the TE671 cell line, isolated from a human rhabdomyosarcoma (ATCC CRL 8805) (purchased from ATCC, and tested for absence of usual cell culture contaminants by ECACC), have been used for the definitive construction of packaging cell lines. The HT1080 line was chosen among a panel of primate and human lines because MLV-A and RD114 efficiently rescued retroviral vectors from these cells and also because RD114 pseudotypes produced by this cell line were stable when incubated in human serum. In a standard assay (Takeuchi et al., J Virol (1994), 68, 8001-8007), these latter viruses were found to be more than 500-fold more stable than similar pseudotypes produced in 3T3 cells.

Another advantage for the use of non murine cells to derive packaging lines is the absence of MLV-related endogenous retroviral-like sequences (like VL30 in 3T3 cells) that can cross-package with MLV-derived retroviral vectors (Torrent et al., 1994) and generate potentially harmful recombinant retroviruses.

The helper constructs were introduced into other cell lines (HT1080 (Table 2), Mink Mv-1-Lu (Table 2), 3T3 (not shown), TE671 (Table 2)) for the purpose of comparing the efficiency of the constructs.

As illustrated hereinafter (Table 2), the reverse transcriptase (RT) activity (provided by expression of the pol gene) in cells transfected with CeB is significantly higher than that of the same cells transfected by the parental plasmid pCRIP or that of cells chronically infected by MLV. This enhancement of viral gene expression is correlated with the titers of lacZ retroviral vectors when an envelope is provided in CeB-lacZ cells after comparison with titers of lacZ pseudotypes of either replication-competent viruses or other helper-free packaging systems.

For the generation of final packaging cell lines, the best clonal env transfectants have been selected. Packaging systems obtained in this way will be able to produce helper-free retroviral vectors at titers greater than 10⁸ infectious particles per ml, which would be 10-100 fold higher than the helper-free preparations of others.

Because of the way the selectable markers are expressed (see above), growing the packaging cells under phleomycin and blasticidin selective pressure increases and stabilizes the expression of the retroviral components and particularly the envelopes, as it is possible that env glycoproteins have toxic effects for the producer cells in the long term which may lead to a decrease of expression.

Such an enhancement of viral production observed with the packaging systems described herein might increase the emergence of unwanted retroviruses having recombined between the genomes of both the retroviral vector and either of the two packaging-deficient constructs. However, the constructs have been designed in such a way as to reduce the probability of emergence of recombinant viruses compared to the parental constructs. To check their safety, attempts have been made to detect the presence of replication-competent retroviruses by a mobilisation assay of a lacZ provirus. No RC viruses have been found in any of the retroviral vector preparations tested so far.

The following Examples illustrate the invention.

EXAMPLE 1

Preparation of Cell lines and viruses

The following cell lines were purchased from ATCC: A204 (ATCC HTB 82), HeLa (ATCC CCL2), HT1080 (ATCC CCL121), MRC5 (ATCC CCL171), T24 (ATCC HTB 4), VERO (ATCC CCL81) and D17 (ATCC CCL183).

HOS, TE671 and Mv-1-Lu cells and their clones harboring the MFGnlslacZ retroviral vector were as described by Takeuchi et al., J Virol (1994), 68, 8001-8007.

The above cell lines were grown in DMEM (Gibco-BRL, U.K.) supplemented with 10% fetal calf serum.

The following cell lines were also used: EB8 (Battini et al., J. Virol (1992) 66: 1468-1475); psiCRE, psiCRELLZ and psiCRIP (Danos et al., Proc. Natl. Acad. Sci USA (1988) 85: 6460-6464); GP+EAM12 (Markowitz et al., Virology (1988), 167, 400-406); and NIH-3T3 murine fibroblasts.

These cell lines were grown in DMEM (GIBCO-BRL, U.K.) supplemented with 10% new-born calf serum. Mv-1-Lu, TE671 and HT1080 cells were transfected using the calcium-phosphate precipitation method (Sambrook et al., "Molecular Cloning" 1989, Cold Spring Harbor Laboratory Press: N.Y.) as described elsewhere (Battini et al., supra). CeB-transfected Mv-1-Lu, TE671 and HT1080 cells were selected with 3, 6-8 and 4 μg/ml of blasticidin S (ICN, UK), respectively, and blasticidin-resistant colonies were isolated 2-3 weeks later. Cells transfected with the various env-expression plasmids were selected with phleomycin (CAYLA, France): 50 μg/ml (for FBASALF-transfected cells) or 10 μg/ml (for FBASAF-, FbdelPASAF-, FbdelPMOSAF-, FBdelP10A1SAF- or FBdelPRDSAF-transfected cells). Phleomycin-resistant colonies were isolated 2-3 weeks later.

Production of lacZ pseudotypes using replication competent viruses, amphotropic murine leukemia virus (MLV-A) 1504 strain and cat endogenous virus RD114, was carried out as described previously (Takeuchi et al., J Virol (1994), 68, 8001-8007).

EXAMPLE 2

Preparation of Plasmids

The env gene of pCRIP (Danos et al., supra) was excised by HpaI/ClaI digestion. A 500 bp PCR-generated DNA fragment was obtained using pSV2-bsr (Izumi et al., Experimental Cell Research (1991), 197, 229-233) as template and a pair of oligonucleotides:

(5'>CGGAATTCGGATCCGAGCTCGGCCCAGCCGGCCACCATGAAAACATTTAACATTTCTC) (SEQ ID NO: 10) at 5' end and

(5'>GATCCATCGATAAGCTTGGTGGTAAAACTTTT) (SEQ ID No 11) at 3' end, with SfiI and ClaI sites, respectively. This fragment was inserted in HpaI/ClaI sites of pCRIP by co-ligation with an 85 bp HpaI/SfiI DNA fragment isolated from pOXEnv (Russell et al., Nucleic Acids Research (1993), 21, 1081-1085) which provides the end of the Moloney murine leukemia virus (MOMLV) pol gene. The resulting plasmid named CeB (FIG. 1) could express the MOMLV gag-pol gene as well as the bsr selectable marker conferring resistance to blasticidin S, both driven by the MoMLV 5' LTR promoter.
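The presence of the SfiI and ClaI sites in the two oligonucleotides can be confirmed by a simple pattern search. The sketch below uses the standard recognition sequences for SfiI (GGCCNNNNNGGCC) and ClaI (ATCGAT); the function name is hypothetical, and whitespace in a sequence is ignored so that strings copied from a sequence listing can be used directly.

```python
import re

# Standard recognition sequences (N = any base).
SITES = {"SfiI": "GGCCNNNNNGGCC", "ClaI": "ATCGAT"}


def find_sites(seq, sites=SITES):
    """Map enzyme name -> list of 0-based positions where its site starts in seq."""
    seq = "".join(seq.split()).upper()  # tolerate spaces from sequence listings
    hits = {}
    for name, site in sites.items():
        # Lookahead so that overlapping occurrences are also reported.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        hits[name] = [m.start() for m in re.finditer(pattern, seq)]
    return hits


forward_primer = "CGGAATTCGGATCCGAGCTCGGCCCAGCCGGCCACCATGAAAACATTTAACATTTCTC"  # SEQ ID NO: 10
reverse_primer = "GATCCATCGATAAGCTTGGTGGTAAAACTTTT"  # SEQ ID NO: 11

print(find_sites(forward_primer))
print(find_sites(reverse_primer))
```

The 5' oligonucleotide carries the SfiI site and the 3' oligonucleotide carries the ClaI site, in agreement with the description above.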

A series of env-expression plasmids was generated using the 4070A MLV (amphotropic) env gene (Ott et al., J Virol (1990), 64, 757-766) and the FB29 Friend MLV promoter (Perryman et al., Nucleic Acid Res (1991), 19, 6950). In FBASALF (FIG. 1) a BglII/ClaI fragment containing the env gene was cloned in BamHI/ClaI sites of plasmid FB3LPh which also contained the C57 Friend MLV LTR driving the expression of the phleo selection marker. A 136 bp env fragment was generated by PCR using plasmid FB3 (Heard et al., J Virol (1991), 65, 4026-4032) as template and a pair of oligonucleotides: (5'>GCTCTTCGGACCCTGCATTC) (SEQ ID NO 12) at 5' end (before ClaI site) and (5'>TAGCATGGCGCCCTATGGCTCGTACTCTATAGGC)(SEQ ID NO 13) at 3' end, providing a KasI restriction site immediately after the env stop codon. This PCR fragment was digested using ClaI and KasI. A DNA fragment containing the FB29 LTR and the MLV-A env gene was obtained by NdeI/ClaI digestion of FBASALF. The fragments were co-ligated in NdeI/KasI digested pUT626 (kindly provided by Daniel Drocourt, CAYLA labs, France). In the resulting plasmid, named FBASAF (FIG. 1), the phleo selectable marker was expressed from the same mRNA as the env gene. A BglII restriction site was created after the MscI site at position 214 in the FB29 leader by using a commercial linker (Biolabs, France). A NdeI/BglII fragment containing the FB29 LTR was co-inserted with the BglII/ClaI env fragment in NdeI/ClaI-digested FBASAF plasmid DNA, resulting in plasmid FBdelPASAF (FIG. 1). Compared to FBASAF, FBdelPASAF has a 100 bp larger deletion in the leader region.

EXAMPLE 3

Cloning and Sequencing of the RD114 env gene

The RD114 env gene was first sub-cloned in plasmid Bluescript KS+ (Stratagene) as a 3 Kb HindIII insert isolated from SC3C, an RD114 infectious DNA clone (Reeves et al., J. Virol (1984), 52, 164-171). A 2.7 kb ScaI-HindIII fragment of this subclone containing the RD114 env gene was sequenced (FIG. 4 (SEQ ID NO 1); EMBL accession number X87829). The 5' non-coding sequence upstream of an NdeI site was deleted by an EcoRI/NdeI digestion followed by filling-in with Klenow enzyme and self-ligation. From this plasmid, two DNA fragments were obtained: a BamHI/NcoI 2.5 Kb fragment and a 63 bp PCR-generated DNA fragment using (5'>CGCCTCATGGCCTTCATTAA) (SEQ ID NO 14) at 5' end (before NotI site) and (5'>TAGCATGGCGCCTCAATCCTGAGCTTCTTCC) (SEQ ID NO 15) at 3' end, providing a KasI restriction site just after the RD114 env gene stop codon. The PCR fragment was digested with NcoI and KasI. Both fragments were co-inserted between BglII and KasI sites of FBdelPASAF and the resulting plasmid was named FBdelPRDSAF (FIG. 1).

Plasmid pCRIPAMgag- (Danos, O. et al., Proc Natl Acad Sci USA (1988) 85: 6460-6464) was used for transfection.

EXAMPLE 4

Infection Assays

Target cells were seeded in 24-multiwell plates (4×10⁴ cells per well) and were incubated overnight. Infections were then carried out at 37° C. by plating 1 ml dilutions of viral supernatants in the presence of 4 μg/ml polybrene (Sigma) on target cells. Three hours later, virus-containing medium was replaced by fresh medium and infected cells were incubated for two days before X-gal staining, performed as previously described (Tailor et al., J Virol (1993), 67, 6737-6741, Takeuchi et al., J Virol (1994), 68, 8001-8007). Viral titers were determined by counting lacZ-positive colonies as previously described (Cosset et al., J. Virol. (1990) 64: 1070-1078). Stability of lacZ pseudotypes in fresh human serum was examined by titrating surviving virus after incubation in a 1:1 mixture of virus harvest in serum-free medium and fresh human serum for 1 h at 37° C. as described before (Takeuchi et al. supra).
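The arithmetic behind a colony-count titer and a serum-stability percentage can be sketched as follows. This is an illustrative calculation only: it assumes the usual convention that the titer of the undiluted supernatant equals the colony count multiplied by the dilution factor and divided by the plated volume, and the function names and example numbers are hypothetical rather than taken from the cited protocols.

```python
# Sketch of titer and serum-stability arithmetic (hypothetical helper names).

def titer_iu_per_ml(colonies: int, dilution_factor: float, volume_ml: float = 1.0) -> float:
    """lacZ i.u./ml of the undiluted supernatant from one counted well.

    colonies        -- number of lacZ-positive colonies counted in the well
    dilution_factor -- e.g. 1e4 for a 1:10,000 dilution of the supernatant
    volume_ml       -- volume of diluted supernatant plated (1 ml in the text)
    """
    if dilution_factor <= 0 or volume_ml <= 0:
        raise ValueError("dilution_factor and volume_ml must be positive")
    return colonies * dilution_factor / volume_ml


def stability_percent(titer_human_serum: float, titer_control: float) -> float:
    """Infectivity remaining after human serum, as a percentage of the control."""
    if titer_control <= 0:
        raise ValueError("titer_control must be positive")
    return 100.0 * titer_human_serum / titer_control


if __name__ == "__main__":
    # Hypothetical example: 45 colonies in a well that received 1 ml of a 10^-4 dilution.
    print(titer_iu_per_ml(45, 1e4))
    # Hypothetical example: serum-treated versus control titers of the same harvest.
    print(stability_percent(5.2e5, 2.0e6))
```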

EXAMPLE 5

Reverse Transcriptase (RT) Assay.

RT assays were performed either as described previously (Takeuchi et al. supra) or using an RT assay kit (Boehringer Mannheim, U.K.) following the manufacturer's instruction but using MnCl₂ (2 mM) instead of MgCl₂.

EXAMPLE 6

Screening Producer Cell Lines

Viral particles generated with RD114 envelopes have been found to be more stable in human serum than virions with MLV-A envelopes, and the producer cell line has also been found to control sensitivity (Takeuchi et al. supra). A panel of cell lines was screened for their ability to produce high titer viruses and for the sensitivity of these virions to human serum. To do this, cells were infected at high multiplicity with lacZ pseudotypes of either MLV-A or RD114 and cells producing helper-positive lacZ pseudotypes were established. Human HT1080 and TE671 and mink Mv-1-Lu cells were found to release high titer lacZ(RD114) and lacZ(MLV-A) viruses. LacZ(MLV-A) pseudotypes produced by HT1080 cells were more resistant to human serum than those produced by other cells. The titer of these viruses was only four-fold less following a 1 hr incubation with human serum than a control incubation (Table 1). LacZ(RD114) pseudotypes produced by human cells or mink Mv-1-Lu cells were in general stable in human serum (Table 1). These results suggested that HT1080, TE671 and Mv-1-Lu cells provided the best combination of high lacZ titers and resistance to human serum and they were therefore used for the generation of retroviral packaging cells.

                              TABLE 1
______________________________________________________________________
Titer and stability of lacZ pseudotypes.

Producer     LacZ (MLV-A)                   LacZ (RD114)
cell         Titer (a)     Stability (b)    Titer (a)     Stability (b)
______________________________________________________________________
A204         650           <3               1,200         105
HeLa         9             nd               2,000         115
HOS          4,500         6                23,000        86
HT1080       2,000,000     26               400,000       129
MRC-5        450           10               1,000         nd
T24          350           nd               1,200         nd
TE671        15,000        2                90,000        38
VERO         260           nd               90            nd
D17          900           <1               200,000       1
Mv-1-Lu      80,000        1                200,000       120
______________________________________________________________________
(a) Titration on TE671 cells as lacZ i.u./ml
(b) % of infectivity of human serum-treated viruses compared to fetal calf serum-treated viruses

EXAMPLE 7

Construction of an Improved gag-pol Expression Vector

A MoMLV gag-pol expression plasmid, CeB (FIG. 1), was derived from pCRIP (Danos et al., Proc. Natl. Acad. Sci. USA (1988) 85: 6460-6464). Approximately 2 Kb of env sequence were removed from pCRIP and the bsr selectable marker, conferring resistance to blasticidin S (Izumi et al., Experimental Cell Research (1991) 197: 229-233), was inserted 74 nts downstream of the gag-pol gene. This 74 nts interval had no ATG triplets and was thought to provide an optimal distance between the stop codon of the pol reading frame and the start codon of the bsr gene to allow re-initiation of translation (Kozak Mol Cell Biol., 1987, 7: 3438-3445). There was no "Kozak" consensus sequence (Kozak Cell, (1986) 44: 283-292) at the 5' end of the marker gene. Therefore, bsr could only be expressed by re-initiation of translation after the upstream gag-pol gene had been expressed. Consequently, after transfection of CeB in Mv-1-Lu/MFGnlsLacZ (ML), TE671/MFGnlsLacZ (TEL) or HT1080 cells, blasticidin S-resistant bulk populations and most cell clones expressed high levels of gag-pol proteins, as assessed by the reverse-transcriptase (RT) activity found in cell supernatants (Table 2). Considerably higher RT activities were found in bulk populations of CeB-transfected ML cells compared to bulk populations of ML cells stably transfected with the parental pCRIP construct. Similarly, the RT activities of two packaging cell lines generated using the pCRIPenv- construct, psiCRE cells (Danos et al., supra) and EB8 cells (Battini supra), were less than that of CeB-transfected clones (Table 2). Finally, RT activity in CeB-transfected cell supernatants was higher than that of cells chronically infected by replication-competent MLV-A (Table 2).
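The design rule described above (an intercistronic interval of about 74 nts containing no ATG triplets between the upstream stop codon and the marker start codon) lends itself to a simple sequence check. The sketch below is illustrative only: the actual spacer sequence of CeB is not given in this passage, so a made-up 74-nt placeholder is used, and the function name is hypothetical. The check also covers ATG triplets created across the junctions with the stop and start codons.

```python
# Sketch: scan a stop-codon / spacer / start-codon region for unwanted ATG triplets.

def internal_atg_positions(stop_codon: str, spacer: str, start_codon: str = "ATG"):
    """Return 0-based positions of every ATG in stop+spacer+start other than
    the intended downstream start codon itself (any reading frame)."""
    region = (stop_codon + spacer + start_codon).upper()
    true_start = len(stop_codon) + len(spacer)
    return [
        i
        for i in range(len(region) - 2)
        if region[i:i + 3] == "ATG" and i != true_start
    ]


# Placeholder spacer, NOT the real CeB sequence: 74 nt with no ATG triplet.
HYPOTHETICAL_SPACER = "CCTAGG" * 12 + "CC"

if __name__ == "__main__":
    print(len(HYPOTHETICAL_SPACER))                            # spacer length in nt
    print(internal_atg_positions("TAG", HYPOTHETICAL_SPACER))  # empty list means no stray ATG
    # A junction can create an ATG even when the spacer alone has none:
    print(internal_atg_positions("TGA", "TGCC"))
```

An empty result indicates that the only ATG a scanning ribosome meets after the upstream stop codon is the start codon of the marker gene.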

                              TABLE 2
______________________________________________________________________
Secreted reverse transcriptase expression

Cell (a)            RT activity (b)    LacZ titer (c)
______________________________________________________________________
ML/MLV-A            1                  8 × 10⁴
MLSvB               0.1                <1
MLCRIP (bulk)       0.15               nd
MLCeB (bulk)        1.7                nd
MLCeB1              4.2                1 × 10⁶
MLCeB4              1.6                1 × 10⁶
TEL/MLV-A           3.6                2 × 10⁶
TELCeB6             5.2                4 × 10⁷
HT1080/MLV-A        1.1                1 × 10⁶
HTCeB6              1.9                1 × 10⁶
HTCeB18             2.7                2 × 10⁶
HTCeB22 (FLY)       6.9                5 × 10⁶
HTCeB48             5.5                3 × 10⁶
EB8                 0.22               1 × 10⁴
psiCRE-LLZ          1.2                1 × 10⁵ (d)
______________________________________________________________________
(a) ML, Mv-1-Lu cells harboring a MFGnlslacZ provirus; TEL, TE671 cells harboring a MFGnlslacZ provirus; /MLV-A, cells chronically infected with MLV-A 1504 strain; MLSvB, ML cells transfected with plasmid pSV2bsr alone; MLCRIP, ML cells cotransfected with pCRIP and pSV2bsr.
(b) Average of arbitrary units relative to ML/MLV-A RT activity of at least two independent experiments is shown. The standard errors did not exceed 20% of the values.
(c) Titration on TE671 cells as lacZ i.u./ml, after polyclonal transfection of a plasmid which expresses MLV-A env in MLCeB clones, TELCeB clones, HTCeB clones and EB8 cells; nd, not done.
(d) Titration on NIH3T3 cells.

To rescue infectious lacZ viruses, MLCeB and TELCeB clones were transfected with FBASALF DNA, a plasmid designed to express the MLV-A env gene (FIG. 1). Bulk populations of stable FBASALF transfectants were isolated and supernatants were titrated using TE671 cells as targets. Titers of lacZ viruses were higher than either MLV-A infected ML or TEL cells, or FBASALF-transfected EB8 cells (Table 2). These data suggested that CeB was an extremely efficient MLV gag-pol expression vector in mink Mv-1-Lu and TE671 cells. CeB was therefore used to derive packaging cells by transfection of HT1080 cells. 41/49 blasticidin S-resistant colonies had detectable levels of RT; 9 had RT activity higher than that of control MLV-A-infected HT1080 cells (data not shown). Expression of gag precursor was confirmed in cell lysates and supernatants of these 9 HTCeB clones by immunoblotting using antibodies against p30-CA (data not shown). The 4 clones with the highest expression of gag proteins (clones 6,18,22 and 48) were infected at high-multiplicity with helper free, lacZ pseudotypes bearing MLV-A envelopes (MFGnlslacZ(A)) produced by TELCeB6/FBASALF (Table 3) and then transfected with FBASALF. Supernatants of bulk, phleomycin-resistant transfectants were assessed for RT activity and lacZ titer (Table 2). Clone HTCeB22, named FLY, was found to be the best gag-pol producer clone and was used to introduce env expression vectors for the generation of packaging cell lines.

                              TABLE 3
______________________________________________________________________
Titer following env construct transfection

Producer cell        Env source          Titer (a)
______________________________________________________________________
psiCRIP lacZ 5       pCRIPAMgag-         6 × 10⁴ (b)
GP+EAM12 lacZ 25     envAM               3 × 10⁵ (b)
TELCeB6              FBASALF (c)         5 × 10⁷
                     FBASAF (c)          2 × 10⁷
                     FBdelPASAF (c)      2 × 10⁷
TELCeB6              FBdelPASAF 1        3 × 10⁷
                     FBdelPASAF 4        2 × 10⁷
                     FBdelPASAF 6        1 × 10⁷
                     FBdelPASAF 7        5 × 10⁷
                     FBdelPASAF 8        1 × 10⁷
                     FBdelPRDSAF 2       1 × 10⁶
                     FBdelPRDSAF 4       3 × 10⁵
                     FBdelPRDSAF 7       1 × 10⁷
                     FBdelPRDSAF 8       2 × 10⁶
FLY (d)              FBdelPASAF 1        1 × 10¹
                     FBdelPASAF 4        1.5 × 10⁶
                     FBdelPASAF 5        1 × 10⁶
                     FBdelPASAF 7        1 × 10⁶
                     FBdelPASAF 13       7 × 10⁶
                     FBdelPASAF 14       4 × 10⁶
                     FBdelPASAF 15       1 × 10⁶
                     FBdelPASAF 16       5 × 10⁶
                     FBdelPASAF 17       6 × 10⁶
FLYA4 lacZ 3         FBdelPASAF 4        2 × 10⁷ (b)
FLY (d)              FBdelPRDSAF 1       2.5 × 10⁶
                     FBdelPRDSAF 2       1 × 10⁷
                     FBdelPRDSAF 6       5 × 10⁶
                     FBdelPRDSAF 10      2 × 10⁶
                     FBdelPRDSAF 11      3 × 10⁶
                     FBdelPRDSAF 13      1 × 10⁶
                     FBdelPRDSAF 17      5 × 10⁶
                     FBdelPRDSAF 18      3 × 10⁷
                     FBdelPRDSAF 19      6 × 10⁶
______________________________________________________________________
Average titers of at least three independent experiments are shown. The standard errors did not exceed 30% of the titer values.
(a) Titrated on TE671 cells as lacZ i.u./ml.
(b) Results of best MFGnlslacZ producer clones.
(c) Bulk populations of env-transfectants in TELCeB6 cells.
(d) Titration after bulk infection with helper-free MFGnlslacZ.

EXAMPLE 8

Construction of env Expression Vectors.

A series of MLV-A env expression plasmids was then generated (FIG. 1). In FBASALF, the env gene was inserted between two Friend-MLV LTRs, its expression driven by the FB29 MLV LTR (Perryman et al., supra). Most of the packaging signal located in the leader region was deleted. This plasmid also expressed the phleo selectable marker (Gatignol et al., supra) driven by the 3' LTR. FBASAF and FBdelPASAF were then designed following the same strategy used for CeB. These two vectors differed only by the extent of deletion of the packaging signal, FBdelPASAF having virtually no leader sequence. Compared to the pCRIPAMgag- and pCRIPgag-2 env plasmids expressed in psiCRIP or psiCRE packaging cells (Danos et al., supra), about 5 Kb of gag-pol sequences was removed. In addition, the 258 bp retroviral sequence containing the end of the env gene and the beginning of U3 found in pCRIPAMgag- and pCRIPgag-2 was also removed. For both FBASAF and FBdelPASAF plasmids, the phleo selectable marker was inserted downstream of the env gene by positioning a 76 nts linker with no ATG codons between the two open reading frames. Phleo could therefore only be expressed by re-initiation of translation by the same ribosomal unit that had expressed the upstream env open reading frame. FBdelPASAF was also used to generate FBdelPRDSAF, an RD114 envelope expression plasmid (FIG. 1).

After transfection of the env plasmids into TELCeB6 cells (Table 2), bulk populations of phleomycin-resistant colonies were isolated and their production of lacZ virus measured (Table 3). FBASALF gave a titer of 5×10⁷ lacZ-i.u./ml, whilst titers with either FBASAF or FBdelPASAF were 2×10⁷ lacZ-i.u./ml (Table 3). Titers of 5×10⁷ or 10⁷ lacZ-i.u./ml could be obtained with some FBdelPASAF cell clones or FBdelPRDSAF clones, respectively.

As FBdelPASAF has minimal virus-derived sequences and was shown to be the safest construct (see below and Table 4), it and FBdelPRDSAF were used to generate packaging lines from FLY cells (clone HTCeB22, Table 2). Envelope expression of these clones was assayed by interference to challenge with MFGnlslacZ(A) or MFGnlslacZ(RD) pseudotypes produced by TELCeB6/FBdelPASAF-7 or TELCeB6/FBdelPRDSAF-7, respectively (Table 3). The cell lines showing most interference were cross-infected at high multiplicity with these pseudotypes to provide MFGnlslacZ proviruses, and supernatants were then titrated on TE671 cells (Table 3). FLY-FBdelPASAF-13 (FLYA13 packaging line) and FLY-FBdelPRDSAF-18 (FLYRD18 packaging line) gave the highest productions of lacZ viruses, around 10⁷ lacZ-i.u./ml. The best MFGnlslacZ producer clones derived from either psiCRIP cells (Danos et al., supra) or GP+EAM12 cells (Markowitz et al., supra) gave approximately 50-fold lower titers (Table 3). The lacZ titers of the FLY-derived lines shown in Table 3 are lower than those of the best TELCeB6-derived lines after transfection of either FBdelPASAF or FBdelPRDSAF (Table 3). However, it should be noted that the lacZ provirus expressed in TELCeB6 cells was obtained after clonal selection but was introduced polyclonally in FLY-derived env-transfected cell clones. When FLY-FBdelPASAF-4 cells (FLYA4 packaging line), infected with helper-free MFGnlslacZ(RD), were cloned by limiting dilution, the best clones (e.g. FLYA4lacZ3) were found to produce 20 times more infectious viruses than the bulk population, reaching the range of titers obtained with the best TELCeB6-FBdelPASAF clones (Table 3).

EXAMPLE 9

Assays for Transfer of gag-pol or env Functions

To assay for replication-competent viruses, supernatants were used to infect TEL cells (a clone of TE671 cells harboring an MFGnlslacZ provirus). Infected cells were passaged for 6 days or longer and their supernatants were used for infection of fresh TE671 cells. No transmission of lacZ viruses could be detected (Table 4), demonstrating that the supernatants of pCRIPAMgag-, FBASALF-, FBASAF-, or FBdelPASAF-transfected TELCeB6 cells were helper-free. Similar absence of replication competent recombinant retroviruses was demonstrated using supernatant from a clone of psiCRIP-MFGnlslacZ cells or from two clones of FLYA-MFGnlslacZ cells (Table 4).

There have been reports that helper-free retroviral vector stocks may nevertheless contain recombinant retroviruses (replication incompetent) carrying either gag-pol or env genes (Bestwick et al., Proc Natl Acad Sci USA (1988), 85, 5404-5408, Cosset et al., Virology (1993), 193, 385-395, Girod et al., Virology (1995), in press). To assay for such recombinant retroviruses, mobilisation of an MFGnlslacZ provirus from two indicator cell lines which could cross-complement potential recombinant viruses carrying either gag-pol or env functional genes was attempted. The TELCeB6 line (Table 2) expressing gag-pol proteins was used as indicator cell line to test for the presence of env recombinant (ER) viruses. The TELMOSAF indicator line expressing MoMLV env glycoproteins (obtained by transfection of FBMOSAF, a plasmid expressing the MoMLV env gene using the FBASAF backbone, in TEL cells) was used to detect the presence of gag-pol recombinant retroviruses (GPR viruses). After passaging for 4-8 days, the supernatants of the infected indicator cells were used to infect either human TE671 cells or murine NIH3T3 cells.

TELCeB6 cells transfected with various env-expressing constructs, pCRIPAMgag-, FBASAF and FBdelPASAF, were compared. Although the supernatants of TELCeB6-FBdelPASAF cells were devoid of replication-competent retroviruses, they were found sporadically to transfer gag-pol genomes (Table 4). No GPR viruses could be detected when less than 2×10⁵ virions were used to infect the indicator cells. Similarly, TELCeB6 indicator cells infected with various helper-free viruses were shown sporadically to release lacZ virions (Table 4). The number depended both on the env-expression vector used and on the virus input quantity. Compared to lacZ viruses generated using the pCRIPAMgag- plasmid, the frequency of detection of the env-recombinant viruses was lower for supernatants generated by using the FBASAF and FBdelPASAF constructs (Table 4). For the FBdelPASAF construct, when less than 5×10⁵ MFGnlslacZ(A) helper-free virions were used to infect the indicator cells, no ER retroviruses could be detected. From these experiments, it could be estimated that a supernatant produced from TELCeB6-FBdelPASAF cells and containing 1×10⁷ infectious units of MFGnlslacZ retroviral vector contained no replication-competent virus, and about 100 gag-pol and 100 env recombinant retroviruses.
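The text does not spell out how the figure of about 100 recombinants per 10⁷ infectious units was reached. One reading that reproduces it, offered here purely as an assumption, is an order-of-magnitude frequency of roughly one recombinant per 10⁵ vector particles, in line with recombinants ceasing to be detectable once the input falls below a few 10⁵ virions. The function name and the frequency value are illustrative.

```python
# Sketch: order-of-magnitude estimate of recombinants in a vector stock.
# ASSUMPTION (not stated in the text): about one recombinant per 1e5 vector i.u.

def expected_recombinants(vector_iu: float, vector_iu_per_recombinant: float) -> float:
    """Expected number of recombinant particles in a stock of `vector_iu` i.u."""
    if vector_iu_per_recombinant <= 0:
        raise ValueError("vector_iu_per_recombinant must be positive")
    return vector_iu / vector_iu_per_recombinant


if __name__ == "__main__":
    stock_iu = 1e7            # stock size quoted in the text
    assumed_frequency = 1e5   # hypothetical: vector i.u. per recombinant
    print(expected_recombinants(stock_iu, assumed_frequency))
```

Using the individual detection limits instead (2×10⁵ for gag-pol, 5×10⁵ for env) gives somewhat lower figures, so the estimate should be read as an order of magnitude only.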

                              TABLE 4
______________________________________________________________________________
Transfer of packaging function

                                          Input virus (a)   Detection (b)
Producer cell           Indicator cell    (lacZ-i.u.)       ++     +      -
______________________________________________________________________________
Replication competent virus
psiCRIP lacZ 5          TEL               2 × 10⁴           0/4    0/4    4/4
TELCeB6-pCRIPAMgag-     TEL               5 × 10⁶           0/4    0/4    4/4
TELCeB6-FBASAF          TEL               5 × 10⁶           0/4    0/4    4/4
TELCeB6-FBdelPASAF      TEL               5 × 10⁶           0/4    0/4    4/4
FLYA4 lacZ 3            TEL               1 × 10⁷           0/4    0/4    4/4
FLYA4 lacZ 7            TEL               1 × 10⁷           0/4    0/4    4/4

Gag-pol recombinant
TELCeB6-FBdelPASAF 7    TELMOSAF          2 × 10⁷           0/4    1/4    3/4
TELCeB6-FBdelPASAF 7    TELMOSAF          2 × 10⁶           0/4    2/4    2/4
TELCeB6-FBdelPASAF 7    TELMOSAF          2 × 10⁵           0/4    2/4    2/4
TELCeB6-FBdelPASAF 7    TELMOSAF          2 × 10⁴           0/4    0/4    4/4

Env recombinant
TELCeB6-pCRIPAMgag-     TELCeB6           5 × 10⁶           2/4    1/4    1/4
TELCeB6-pCRIPAMgag-     TELCeB6           5 × 10⁵           1/4    1/4    2/4
TELCeB6-pCRIPAMgag-     TELCeB6           5 × 10⁴           0/4    2/4    2/4
TELCeB6-FBASAF          TELCeB6           5 × 10⁶           0/4    2/4    2/4
TELCeB6-FBASAF          TELCeB6           5 × 10⁵           0/4    1/4    3/4
TELCeB6-FBASAF          TELCeB6           5 × 10⁴           0/4    1/4    3/4
TELCeB6-FBdelPASAF      TELCeB6           5 × 10⁶           0/4    1/4    3/4
TELCeB6-FBdelPASAF      TELCeB6           5 × 10⁵           1/4    3/4    0/4
TELCeB6-FBdelPASAF      TELCeB6           5 × 10⁴           0/4    0/4    4/4
______________________________________________________________________________
(a) Number of lacZ i.u. used to infect indicator cells.
(b) Number of incidences out of four experiments. The ranges of lacZ titers rescued from infected indicator cells are shown for each virus input: >100 lacZ i.u./ml (++), 1-100 lacZ i.u./ml (+) and <1 lacZ i.u./ml (-).

Titers were determined on TE671 cells for replication-competent virus and env recombinants, and on NIH3T3 cells for gag-pol recombinants.

EXAMPLE 10

In order to confirm resistance to complement and absence of replication competent virus in our best packaging lines, MFGnlslacZ(A) and (RD) harvested from FLYA13 and FLYRD18, respectively, after polyclonal transduction of MFGnlslacZ (Table 3 above) were tested for stability in fresh human serum and generation of replication competent virus. Titers of MFGnlslacZ(RD) from FLYRD18 after 1 hr incubation with 3 independent samples of fresh human serum were 80 to 120% of control incubations, while titers of MFGnlslacZ(A) from FLYA13 were 50 to 90% of controls (data not shown). No replication competent virus was detected in the same assay described above (Table 4) when 1×10⁷ i.u. each of MFGnlslacZ(A) and (RD) were tested.

EXAMPLE 11

Generation of Plasmids

The CeB plasmid (FIG. 5), expressing the MoMLV gag-pol gene, was further modified to remove the splice donor site located in the leader region. A 272 bp fragment was PCR-generated by using OUSD (5'-TCTCGCTTCTGTTCGCGCGC; SEQ ID NO: 16) and OLSD (5'-TCGATCAAGCTTGCGGCCGCGGTGGTGGGTCGGTGGTCC; SEQ ID NO: 17) as primers and further digested with BssHII and HindIII. A 1008 bp HindIII-XhoI fragment isolated from CeB (encompassing a part of the leader sequence and the beginning of MoMLV gag) and the PCR fragment were co-inserted into pCeB from which the 1275 bp BssHII-XhoI fragment (encompassing R-U5-leader-gag) had been removed. The resulting plasmid, named pCeB DS- (FIG. 5), bore the deletion of the splice donor (SD) site and a NotI restriction site created just downstream of the lost SD site.
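The restriction sites carried by the two primers can be located with a short scan, which shows where the BssHII, HindIII and NotI sites used in this construction sit in OUSD and OLSD. This is a sketch under stated assumptions: the recognition sequences (BssHII GCGCGC, HindIII AAGCTT, NotI GCGGCCGC) are general knowledge rather than taken from the text, and the helper name is illustrative.

```python
# Sketch: list which restriction sites occur in a primer and at which positions.

RECOGNITION_SITES = {       # assumed recognition sequences, all palindromic
    "BssHII": "GCGCGC",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
}


def sites_in(seq: str, sites=RECOGNITION_SITES):
    """Map enzyme name -> list of 0-based start positions found in `seq`."""
    seq = seq.upper()
    found = {}
    for enzyme, motif in sites.items():
        positions = [
            i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        ]
        if positions:
            found[enzyme] = positions
    return found


OUSD = "TCTCGCTTCTGTTCGCGCGC"                     # SEQ ID NO: 16
OLSD = "TCGATCAAGCTTGCGGCCGCGGTGGTGGGTCGGTGGTCC"  # SEQ ID NO: 17

if __name__ == "__main__":
    print(sites_in(OUSD))
    print(sites_in(OLSD))
```

Because the three motifs are palindromic, scanning one strand is sufficient; a non-palindromic site would also need the reverse complement to be searched.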

A series of gag-pol expression plasmids in which the MoMLV LTR promoter was replaced by the human cytomegalovirus immediate early promoter (hCMV promoter) was derived from both CeB DS- and hCMV-G (Yee et al., 1994 PNAS, 91: 9564-9568), a plasmid used as a source for the hCMV promoter. A NotI-filled/EcoRI 7260 bp fragment was isolated from CeB DS- and cloned into hCMV-G which had been opened with SalI (further rendered blunt-ended) and EcoRI to remove the VSV-G gene. The resulting plasmid was cut with ClaI and EcoRI to remove a 1155 bp fragment encompassing sequence derived from the 3'-LTR and SV40 polyA sequence, and self-ligated after filling both protruding DNA ends. The resulting plasmid, named phCMV-intron (FIG. 5), had gag-pol and bsr ORFs inserted between the CMV promoter and rabbit beta-globin polyA post-transcriptional regulatory sequences.

An intermediate plasmid was generated by sub-cloning a 7260 bp EcoRI fragment (isolated from CeB DS-) into hCMVG opened with EcoRI. A 1155 bp fragment (encompassing sequence derived from the 3'-LTR and SV40 polyA sequence) was removed from this intermediate plasmid, which was then re-circularized by self-ligation after filling both ends. The resulting plasmid, named phCMV+intron 2P (FIG. 5), was digested with NotI and the vector was treated with Klenow enzyme. A 1440 bp fragment (encompassing the hCMV promoter and rabbit beta-1 globin intron B (Rohrbaugh et al., 1985 Mol. Cell Biol, 5: 147-160)) was isolated from phCMV+intron 2P by NotI/EcoRI digestion. This fragment was further treated with Klenow enzyme and ligated back into the vector. The resulting plasmid, named hCMV+intron (FIG. 5), could express gag-pol and bsr genes driven by the hCMV promoter and bore an intron sequence derived from rabbit beta-1 globin intron B having both SD and SA (splice acceptor) sites.

A 2450 bp fragment was removed from phCMV+intron 2P by NotI/XhoI digestion. The resulting vector fragment was then used to co-ligate a 1330 bp fragment (containing hCMV promoter +5' end of rabbit beta-1 globin intron B (with SD site)) isolated from phCMVG by ApaI-filled/NotI digestion and a 1 kb fragment isolated from phCMV+intron 2P by NotI-filled/XhoI digestion. Compared to phCMV+intron 2P, the resulting plasmid, named hCMV+SD intron (FIG. 5), had the deletion of the 3' end of the rabbit beta-1 globin intron B and thus no SA site in the leader region.

Construct phCMV+leader (FIG. 5) has been described elsewhere (Savard et al., unpublished). This plasmid, in which gag-pol and bsr genes were driven by the hCMV promoter, had the MoMLV SD site in the leader region.

Gag-pol expression

The different constructs, including the parental CeB plasmid, were analysed comparatively in a complementation assay after transfection in TEL-FBdelPASAF cells expressing 4070A-MLV (amphotropic) envelope and harboring a MFGnlslacZ provirus. The transient production of lacZ retroviruses as well as the stable production of lacZ retroviral vectors after selection with blasticidin S were determined (Table 5). All the constructs were able to rescue infectious lacZ retroviruses, indicating the expression of gag-pol proteins after transient transfection. Most likely due to the efficient hCMV and rabbit beta-1 globin intron B (post)-transcriptional regulatory sequences, hCMV+intron was particularly potent in transient retroviral vector production. However, 10 times fewer blasticidin-resistant colonies were obtained with hCMV+intron than with CeB, and stable lacZ virus production from hCMV+intron was about 5-10 times lower than that of CeB. Clonal examination of lacZ retrovirus production from blasticidin-resistant colonies indicated that 80-90% of colonies could express high levels of gag-pol proteins for both hCMV+intron and CeB plasmids. In contrast, despite variation in their ability to form blasticidin-resistant colonies after transfection and despite their ability to express gag-pol proteins from transient transfectants, all other constructs had a weak capacity for rescuing lacZ retroviral vectors from stable transfectants (Table 5).

                              TABLE 5
______________________________________________________________________________
Comparative study of gag-pol-bsr plasmids.

                      Transient                      Stable
gag-pol-bsr           (lacZ          no clones       (lacZ          % gag-pol/
plasmid               i.u./ml)       bsr+            i.u./ml)       bsr
______________________________________________________________________________
CeB                   300/ml         50              10⁷            90%
CeB DS-               144/ml         5               10⁵            50%
hCMV + intron 2P      ND             20              10⁶            50%
hCMV - intron         812/ml         0               --             --
hCMV + SD intron      150/ml         1000            10²            nd
hCMV + leader         328/ml         1000            10²-10³        nd
hCMV + intron         12000/ml       5               10⁶-10⁷        80%
______________________________________________________________________________

Northern blot analyses were performed on stable transfectants (blasticidin-resistant) obtained with some of the gag-pol-bsr plasmids. As expected, the results (not shown) displayed a correlation between expression of gag-pol mRNAs and gag-pol protein expression detected by rescue analysis (Table 5). The CeB construct was found to produce 2-3 fold more gag-pol mRNAs than hCMV+intron. Interestingly, an unexpected 2.45 kb RNA band was found for the hCMV+intron construct at a ratio of 2:1 compared to the abundance of the gag-pol mRNA band (at 5.95 kb). Further investigations using other probes revealed that a cryptic splice donor (SD) site located in the gag gene (right in the middle of the CA coding region at position 1596-1597; numbering according to Shinnick et al., 1981 Nature (London) 293: 543-548) was activated in this latter construct. The 2.45 kb RNA species, lacking the 3' half of the gag gene and most of the pol gene, is unlikely to give rise to any useful translational product. It is therefore interesting to note that the hCMV+intron construct was able to give rise to slightly more transcripts in total (5.95 kb gag-pol mRNA + 2.45 kb alternative RNA band) than the gag-pol mRNA expressed from the CeB construct. Therefore we decided to inactivate the cryptic SD site in the hCMV+intron construct in order to increase the ratio of gag-pol mRNAs.
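The comparison of total transcript levels follows from the two ratios quoted above and can be worked through explicitly. The sketch below simply combines them: with the hCMV+intron gag-pol mRNA taken as one unit, the 2.45 kb band adds two units, while CeB contributes two to three units of gag-pol mRNA. The function name is illustrative and the calculation assumes band intensities are directly comparable, which the text does not state.

```python
# Sketch: total hCMV+intron transcripts relative to CeB gag-pol mRNA,
# using only the ratios quoted in the text.

from fractions import Fraction


def total_transcript_ratio(ceb_fold_over_hcmv, aberrant_to_gagpol=Fraction(2, 1)):
    """Return (hCMV+intron total RNA) / (CeB gag-pol mRNA).

    ceb_fold_over_hcmv -- CeB gag-pol mRNA relative to hCMV+intron gag-pol mRNA (2 to 3)
    aberrant_to_gagpol -- 2.45 kb band relative to the 5.95 kb band in hCMV+intron (2:1)
    """
    hcmv_gagpol = Fraction(1)                             # reference unit
    hcmv_total = hcmv_gagpol * (1 + aberrant_to_gagpol)   # 5.95 kb + 2.45 kb species
    ceb_total = hcmv_gagpol * ceb_fold_over_hcmv
    return hcmv_total / ceb_total


if __name__ == "__main__":
    for fold in (2, 3):
        print(fold, float(total_transcript_ratio(fold)))
```

Under these assumptions the hCMV+intron total lies between equal to and one and a half times the CeB gag-pol mRNA level, even though only a third of it is full-length gag-pol message.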

Assays for Transfer of gag-pol Functions

Although the supernatants of packaging cell lines generated with the CeB gag-pol expression construct were devoid of replication-competent retroviruses, they were found sporadically to transfer gag-pol genomes (example 9, Table 4) (Cosset et al., 1995 J. Virol 69: 7430-7436). Because the gag-pol-bsr constructs generated here using the hCMV promoter had far fewer retroviral sequences homologous to the retroviral vector than the parental CeB construct (FIG. 5), they are less likely to give rise to gag-pol recombinant (GPR) viruses. Therefore, the most efficient gag-pol-bsr plasmids, hCMV+intron and CeB, were further analysed for emergence of GPR viruses. To assay for such recombinant retroviruses, we attempted to mobilise a lacZ provirus from an indicator cell line which could cross-complement potential recombinant viruses carrying functional gag-pol genes. The results displayed in Table 6 showed that, consistent with data reported previously (example 9, Table 4) (Cosset et al., 1995 Supra), lacZ retrovirus vectors generated using the CeB gag-pol construct were contaminated with GPR viruses. In contrast, lacZ retrovirus vectors generated using the hCMV+intron construct were completely devoid of such GPR viruses, suggesting that this construct is improved compared to CeB with respect to the emergence of recombinant viruses.

                                TABLE 6
______________________________________________________________________________
Comparative study of gag-pol-bsr plasmids.

                     input virus          no. of experiments
plasmid              (lacZ i.u.) (a)      giving titers of (b)
______________________________________________________________________________
CeB                  5 × 10⁶              5         3         0
                     5 × 10⁵              2         4         2
                     5 × 10⁴              0         1         7
hCMV + intron        5 × 10⁶              0         0         8
                     5 × 10⁵              0         0         8
                     5 × 10⁴              0         0         8
______________________________________________________________________________
4 × 10⁴ cells of TEL/MOSAF in 24 wells were challenged with lacZ(A) at the i.u. indicated in the table (a), and incubated at 37° C. for 3 days. Cells were trypsinized and transferred into small flasks. Cell supernatant was harvested on day 5 after lacZ(A) challenge and plated on either TE671 (not shown) or 3T3 cells (b). No lacZ was mobilized into TE671 at all. LacZ(A) from CMVint 10 again did not rescue lacZ from TEL/MOSAF.

EXAMPLE 12

Generic primers were employed to detect, by RT-PCR, D-type (Medstrand and Blomberg J. Virol. (1993) 67: 6778-6787) (SEQ ID NOS: 22 & 23), C-type (Shih et al., J Virol. (1989) 63: 64-75) (SEQ ID NOS: 20 & 21) and human endogenous virus RTVL-H (Wilkinson et al., J. Virol. (1993) 67: 2981-2989) (SEQ ID NOS: 24 & 25) sequences (Patience et al., supra). Primers to detect the mouse endogenous VL30 element (Adams et al Mol. Cell. Biol. (1988) 8: 2989-2998) (SEQ ID NOS: 26 & 27) and MFGnlslacZ RNA (SEQ ID NOS: 18 & 19) were designed and synthesized (TABLE 7). Overnight supernatants (in 4 ml of culture medium) from 10⁶ cells of GP+EAM12lacZ25, FLYA4lacZ3 and TELCeB6FBASALF cells (Table 3) were harvested and centrifuged on a sucrose gradient as described previously (Patience et al., J.Virol., 70: 2654-2657). Fractions containing retrovirus particles were collected, and RNA was extracted. One twentieth of the RNA preparation, or dilutions thereof, was applied to RT-PCR as described previously (Table 7). A 1/200 fraction of the RNA harvested from GP+EAM12lacZ25 cells was positive for VL30 RNA. MFGnlslacZ RNA was detected in 1/20 of the RNA from GP+EAM12lacZ25 and TELCeB6FBASALF cells and in 1/200 of the RNA from FLYA4lacZ3 cells. The primer combinations for RTVL-H, C-type and D-type RNA did not give a detectable PCR product.

                                              TABLE 7
__________________________________________________________________________________________________________________
RT-PCR detection of endogenous retrovirus RNA associated with virus particles.

                                                                RT-PCR of virion-associated RNA from (a)
             primer (5'-3')                                     GP+EAM12      FLYA4        TELCeB6F
RNA          forward (F)/reverse (R)                            lacZ25        lacZ3        BASALF
__________________________________________________________________________________________________________________
MFGnlslacZ   F) CTCTGGCTCACAGTACGACGTAG      SEQ ID NO: 18      +             ++           +
             R) CCATCAATCCGGTAGGTTTTCCG      SEQ ID NO: 19
C-type       F) CARRGKTTCAARAACWSYCCCAC      SEQ ID NO: 20      -             -            -
             R) AGYARVGTAGCNGGGTTHAGG        SEQ ID NO: 21
D-type       F) TCCCCTTGGAATACTCCTGTTTTYGT   SEQ ID NO: 22      -             -            -
             R) CATTCCTTGTGGTAAAACTTTCCAYTG  SEQ ID NO: 23
RTVL-H       F) CCTCACCCTGATCACRYTTG         SEQ ID NO: 24      NT            -            -
             R) GAATTATGTCTGACAGAAGGG        SEQ ID NO: 25
VL30         F) GTTGACATCTGCAGAGAAAGACC      SEQ ID NO: 26      ++            NT           NT
             R) TCTGAGGTCTGTACACACAATGG      SEQ ID NO: 27
__________________________________________________________________________________________________________________
(a): -, not detected; +, detected in 1/20 RNA preparation; ++, detected in 1/200 RNA preparation; NT, not tested because the cells do not possess the corresponding genes.
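Several of the generic primers in Table 7 contain IUPAC ambiguity codes (R, Y, K, W, S, V, H, N), so each written primer stands for a pool of distinct oligonucleotides. The sketch below, which is not part of the original protocol, counts the size of that pool and converts a primer into a regular expression for searching a sequence. The helper names `degeneracy` and `to_regex` are hypothetical; the primer sequences are those of SEQ ID NOS: 20, 21 and 24.

```python
import re

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def to_regex(primer):
    """Regular expression matching every sequence the primer can encode."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


# Primers taken from Table 7.
C_TYPE_F = "CARRGKTTCAARAACWSYCCCAC"   # SEQ ID NO: 20
C_TYPE_R = "AGYARVGTAGCNGGGTTHAGG"     # SEQ ID NO: 21
RTVL_H_F = "CCTCACCCTGATCACRYTTG"      # SEQ ID NO: 24

if __name__ == "__main__":
    for name, seq in [("C-type F", C_TYPE_F), ("C-type R", C_TYPE_R),
                      ("RTVL-H F", RTVL_H_F)]:
        print(name, degeneracy(seq), to_regex(seq))
```

The C-type pair is the most degenerate (128 and 144 variants for the forward and reverse primers respectively), which is consistent with its purpose as a generic detection primer set.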

EXAMPLE 13

Generation of gag-pol Pre-packaging Cells by Using TE671 Cells

CeB, a plasmid designed to over-express MoMLV gag and pol proteins, was introduced into TE671 human rhabdomyosarcoma cells (ATCC CRL8805). After selection with blasticidin, 50 bsr-positive colonies were isolated and the RT (reverse transcriptase) activity in their supernatants was analysed. Twelve TE671-CeB (TECeB) clones with high RT activity were selected for further analysis. The best TECeB clone, clone #15, had an RT activity roughly equivalent to that of TELCeB6 cells (Cosset et al., J. Virol. 69: 7430-7436 (1995); see also Example 7, Table 6 in this patent application) but displayed 2-3 fold more gag precursors in the cells, as demonstrated in immunoblots using anti-CA antibodies. The biological activity of the gag-pol proteins expressed in the six best TECeB clones was further confirmed by their ability to produce infectious retroviruses in a complementation assay. A lacZ provirus was introduced into each of the TECeB clones by polyclonal cross-infection using lacZ(RD114) helper-free retrovirus vectors. FBMOSALF, a MoMLV env expression plasmid (Cosset et al., J. Virol. 69: 6314-6322), was then transfected into each of the TECeB-lacZ lines and into the TELCeB6 cell line for comparison. After selection with phleomycin, the titer of lacZ retrovirus vectors was determined in the supernatant of pools of phleomycin-resistant colonies for each TECeB-lacZ-FBMOSALF line. A good correlation was found between gag-pol expression in the TE-CeB clones (as determined by RT assays and anti-gag immunoblots) and their ability to release infectious lacZ particles. TE-CeB15 cells released approximately the same number of lacZ particles as TELCeB6 cells, although TELCeB6 cells had the advantage of having been selected for lacZ expression (Cosset et al., J. Virol. 69: 7430-7436 (1995)). TE-CeB15 cells were therefore used to derive retroviral packaging cell lines.

Construction of env-expression Plasmids.

A series of plasmids (FIG. 3) was designed to allow expression of different retroviral envelope genes (isolated from MoMLV, GALV [Gibbon Ape Leukemia Virus], and MLV-10A1). FBdelPMOSAF (FIG. 3, nucleotide sequence in FIG. 10 SEQ ID NO: 6) and FBdelP10A1SAF, expressing ecotropic MoMLV or MLV-10A1 envelopes respectively, were generated by replacing the BglII/ClaI fragment from FBdelPASAF (Cosset et al., J. Virol. 69: 7430-7436 (1995); see also Example 7, FIG. 2 and nucleotide sequence in FIG. 9 SEQ ID NO: 5), encompassing most of the env gene and splice acceptor site, with that of MoMLV (position 5407 to 7679, Shinnick et al., 1981) or with that of MLV-10A1 (Ott et al., J. Virol. 64: 757-766 (1990)).

Nucleotides 7514-7516 of GALV (Delassus et al., Virology 173: 205-213 (1989)) were mutated by PCR-mediated mutagenesis to create a ClaI site (AAG to CGA), thereby introducing a conservative modification (a lysine (amino-acid 665 of the GALV env precursor) to an arginine). The BamHI/ClaI fragment (nts 4994 (Delassus et al. Virology 173: 205-213 (1989)) to 7517) was then sub-cloned into FBdelPASAF from which the BglII/ClaI fragment encompassing most of the env gene and splice acceptor site had been removed. The resulting plasmid, expressing GALV envelope glycoproteins, was named FBdelPGASAF (FIG. 3, nucleotide sequence in FIG. 11 SEQ ID NO: 7). CMV10A1 was generated by inserting a Klenow enzyme-filled EagI/SalI fragment from FBdelP10A1SAF (encompassing the 10A1 MLV env gene and the phleo selectable marker) into hCMV-G digested with BamHI and filled with Klenow enzyme. The resulting plasmid, CMV10A1 (FIG. 3 and nucleotide sequence in FIG. 13), could express 10A1 envelopes under control of the hCMV promoter, and the phleo selectable marker by translation re-initiation.
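The AAG to CGA change can be checked against the standard genetic code and the ClaI recognition sequence (ATCGAT). In the sketch below the two codons come from the text; the flanking bases (AT upstream, T downstream) are not stated in the source and are inferred here only from the ClaI recognition sequence and from the fragment ending at nucleotide 7517, so they should be read as an assumption. The helper name `with_flanks` is hypothetical.

```python
# Subset of the standard genetic code covering the two codons in the text.
CODON_TABLE = {"AAG": "Lys", "CGA": "Arg"}

CLAI_SITE = "ATCGAT"  # ClaI recognition sequence

# Assumed flanking bases (not given in the source): they are the only bases
# that let a CGA codon sit inside an ATCGAT hexamer.
UPSTREAM = "AT"
DOWNSTREAM = "T"


def with_flanks(codon):
    """Return the hexamer formed by the codon and the assumed flanking bases."""
    return UPSTREAM + codon + DOWNSTREAM


original_hexamer = with_flanks("AAG")   # wild-type context (inferred)
mutant_hexamer = with_flanks("CGA")     # context after mutagenesis

if __name__ == "__main__":
    print("wild type:", original_hexamer, CODON_TABLE["AAG"])
    print("mutant:   ", mutant_hexamer, CODON_TABLE["CGA"])
    print("ClaI site created:", mutant_hexamer == CLAI_SITE)
```

Lysine and arginine are both basic residues, which is why the substitution is described as conservative.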

Generation of a Multi-tropic Set of TE671-based Retroviral Packaging Lines

FBdelPRDSAF (FIG. 3, nucleotide sequence in FIG. 12 SEQ ID NO: 12), FBdelPASAF, FBdelPGASAF, FBdelPMOSAF and FBdelP10A1SAF were independently introduced into cells of the TE-CeB15 pre-packaging line, expressing MoMLV gag-pol proteins. Transfected cells were phleomycin-selected and 15-20 phleo-resistant colonies were isolated for each env-expression plasmid transfected.

Individual colonies were then analysed for expression of envelope glycoproteins by immunoblots on cell lysates using antibodies against RD114 SU glycoproteins, against Rauscher leukemia virus SU (to screen MoMLV, MLV-4070A and MLV-10A1 env-producer clones) or against GALV. The best env-producer colonies as determined in this assay were further analysed by a complementation assay after introducing a lacZ retroviral vector. LacZ pseudotypes released from the different packaging cell lines were titrated using NIH 3T3 cells or TE671 cells as targets. Titers higher than 1×10⁷ lacZ i.u./ml were obtained for the best clones. Depending on the envelope specificities expressed in these cells, the new TE671-based retroviral packaging cell lines were named TE-FLYE, TE-FLYA, TE-FLYRD, TE-FLY10A1, and TE-FLYGA, expressing the MoMLV, MLV-4070A, RD114, MLV-10A1, and GALV env genes, respectively.

Assays for detecting replication-competent retroviruses (RCRs) were performed in the supernatants of these cells and were negative (less than 1/ml).

TE671 cells are very potent for transient expression, with more than 95% of cells expressing the transgene three days after plasmid transfection (Hatziioannou and Cosset, unpublished data, (1996)). The ability of retroviral packaging cell lines to transiently produce retroviral vectors is of crucial importance for gene therapy where vectors carrying toxic genes have to be prepared. Transient expression of retroviral vectors was compared between cells of the TE-FLYA line and cells of the BING line (Pear et al., Proc Natl Acad Sci U S A 90, 8392-6 (1993)), a retroviral packaging cell line designed to transiently express retroviral vectors. The results (Table 8) showed that TE-FLYA cells were more efficient for transient expression of a lacZ retroviral vector, resulting in higher titers.

                                TABLE 8
______________________________________________________________________________
Comparative study of transient production of lacZ vectors.

packaging                           % transfected     transient
cell line       cell number (a)     cells (b)         titer (c)
______________________________________________________________________________
BING            281                 5.3               2 × 10²
TE-FLYA         117                 35                1.3 × 10³
______________________________________________________________________________
Cells were transfected with the MFGnlslacZ retroviral vector by the calcium phosphate precipitation method, and titers of lacZ vectors (c) released in the cell supernatant were determined as lacZ i.u./ml at day 3 following transfection. The relative number of cells (a) (average per microscope field) and the % of transfected cells (b) determined after Xgal staining are shown.
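The figures in Table 8 can be compared directly by taking TE-FLYA to BING ratios. The sketch below is a simple illustration using only the values reported in the table; the dictionary keys and the helper `fold_change` are hypothetical names, and two data points do not support any statistical conclusion.

```python
# Values copied from Table 8; key names are illustrative only.
TABLE_8 = {
    "BING":    {"cells_per_field": 281, "pct_transfected": 5.3,  "titer": 2e2},
    "TE-FLYA": {"cells_per_field": 117, "pct_transfected": 35.0, "titer": 1.3e3},
}


def fold_change(key):
    """TE-FLYA value divided by BING value for one column of Table 8."""
    return TABLE_8["TE-FLYA"][key] / TABLE_8["BING"][key]


if __name__ == "__main__":
    for key in ("cells_per_field", "pct_transfected", "titer"):
        print(key, round(fold_change(key), 2))
```

On these numbers the transient titer from TE-FLYA is about 6.5 times that from BING, close to the roughly 6.6-fold difference in the percentage of transfected cells, even though fewer TE-FLYA cells were counted per field.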

Retroviral vectors prepared from TE671-based packaging cell lines were analysed for their sensitivity to human complement-mediated inactivation. Experiments were conducted as previously described (Cosset et al., J. Virol. 69: 7430-7436 (1995); see also Example 10 in this patent application) using three human sera from individual donors (Table 9). As expected, MLV-A prepared from mouse 3T3 cells was highly sensitive to inactivation after 1 hr incubation with sera. In contrast, titers of lacZ vectors produced from TE-FLYRD cells were 17 to 55% of control incubations, while titers of lacZ vectors from TE-FLYA cells were 1 to 30% of controls.

                                TABLE 9
______________________________________________________________________________
Human serum sensitivity of viruses produced from TE671-based packaging cell lines.

Virus from:     hu56 (a)          hu57 (a)          BTS (a)
______________________________________________________________________________
3T3/A           <0.2, <0.2        <0.2, <0.2        <0.2, <0.2
TE-FLYE         15, 7.8           16, 11            48, 60
TE-FLYA         1, 0.6            2.2, 7.1          28, 19
TE-FLYRD        17, 22            30, 44            54, 63
______________________________________________________________________________
Three fresh human serum samples were tested in duplicate; hu56 (A+), hu57 (AB+), BTS (AB+). (a) % control (average for FCS and optiMEM treatment) is shown.

    __________________________________________________________________________     #             SEQUENCE LISTING                                                    - -  - - <160> NUMBER OF SEQ ID NOS: 29                                        - - <210> SEQ ID NO 1                                                         <211> LENGTH: 2518                                                             <212> TYPE: DNA                                                                <213> ORGANISM: RD114                                                          <220> FEATURE:                                                                 <221> NAME/KEY: misc.sub.-- feature                                            <222> LOCATION: (1)                                                            <223> OTHER INFORMATION: n is any nucleotide                                    - - <400> SEQUENCE: 1                                                          - - ngagctcagg acaggtagaa agaatgaata gaacaataaa agagaccctt ac -             #taaattga     60                                                                  - - ccttagagac tggcttaaaa gattggagac gcctcctatc tctggctttg tt -             #aagagcca    120                                                                  - - gaaatacgcc caaccgtttt cggctcaccc catatgaaat cctttatggg gg -             #accccccc    180                                                                  - - ctttgtcaac cttgctcaat tccttctccc cctccgatcc taagactgat tt -             #acaagccc    240                                                                  - - gactaaaagg gctgcaaggc gtgcaggccc aaatctggac acccctggcc ga -             #attgtacc    300                                                                  - - ggccaggaca tccacaaact agccacccat ttcaggtggg agactccgtg ta -             #cgtccggc    360                                                                  - - ggcaccgctc tcaaggattg gagcctcgtt ggaagggacc ttacatcgtc ct -             #gctgacca    420     
                                                             - - cgcccaccgc cataaaggtt gacgggatcg ccgcctggat tcacgcatcg ca -             #cgccaagg    480                                                                  - - cagccccaaa aacccctgga ccagaaactc ccaaaacctg gaagctccgc cg -             #ttcggaga    540                                                                  - - accctcttaa gataagactc tcccgtgtct gactgctaat ccaccttgtc cc -             #tgtactaa    600                                                                  - - cccaaaatga aactcccaac aggaatggtc attttatgta gcctaataat ag -             #ttcgggca    660                                                                  - - gggtttgacg acccccgcaa ggctatcgca ttagtacaaa aacaacatgg ta -             #aaccatgc    720                                                                  - - gaatgcagcg gagggcaggt atccgaggcc ccaccgaact ccatccaaca gg -             #taacttgc    780                                                                  - - ccaggcaaga cggcctactt aatgaccaac caaaaatgga aatgcagagt ca -             #ctccaaaa    840                                                                  - - atctcaccta gcgggggaga actccagaac tgcccctgta acactttcca gg -             #actcgatg    900                                                                  - - cacagttctt gttatactga ataccggcaa tgcaggcgaa ttaataagac at -             #actacacg    960                                                                  - - gccaccttgc ttaaaatacg gtctgggagc ctcaacgagg tacagatatt ac -             #aaaacccc   1020                                                                  - - aatcagctcc tacagtcccc ttgtaggggc tctataaatc agcccgtttg ct -             #ggagtgcc   1080                                                                  - - acagccccca tccatatctc cgatggtgga ggacccctcg atactaagag ag -             #tgtggaca   1140                                                                  - - gtccaaaaaa ggctagaaca aattcataag 
gctatgactc ctgaacttca at -             #accacccc   1200                                                                  - - ttagccctgc ccaaagtcag agatgacctt agccttgatg cacggacttt tg -             #atatcctg   1260                                                                  - - aataccactt ttaggttact ccagatgtcc aattttagcc ttgcccaaga tt -             #gttggctc   1320                                                                  - - tgtttaaaac taggtacccc tacccctctt gcgataccca ctccctcttt aa -             #cctactcc   1380                                                                  - - ctagcagact ccctagcgaa tgcctcctgt cagattatac ctcccctctt gg -             #ttcaaccg   1440                                                                  - - atgcagttct ccaactcgtc ctgtttatct tcccctttca ttaacgatac gg -             #aacaaata   1500                                                                  - - gacttaggtg cagtcacctt tactaactgc acctctgtag ccaatgtcag ta -             #gtccttta   1560                                                                  - - tgtgccctaa acgggtcagt cttcctctgt ggaaataaca tggcatacac ct -             #atttaccc   1620                                                                  - - caaaactgga ccagactttg cgtccaagcc tccctcctcc ccgacattga ca -             #tcaacccg   1680                                                                  - - ggggatgagc cagtccccat tcctgccatt gatcattata tacatagacc ta -             #aacgagct   1740                                                                  - - gtacagttca tccctttact agctggactg ggaatcaccg cagcattcac ca -             #ccggagct   1800                                                                  - - acaggcctag gtgtctccgt cacccagtat acaaaattat cccatcagtt aa -             #tatctgat   1860                                                                  - - gtccaagtct tatccggtac catacaagat ttacaagacc aggtagactc gt -             #tagctgaa   1920                                                 
                 - - gtagttctcc aaaataggag gggactggac ctactaacgg cagaacaagg ag -             #gaatttgt   1980                                                                  - - ttagccttac aagaaaaatg ctgtttttat gctaacaagt caggaattgt ga -             #gaaacaaa   2040                                                                  - - ataagaaccc tacaagaaga attacaaaaa cgcagggaaa gcctggcaac ca -             #accctctc   2100                                                                  - - tggaccgggc tgcagggctt tcttccgtac ctcctacctc tcctgggacc cc -             #tactcacc   2160                                                                  - - ctcctactca tactaaccat tgggccatgc gttttcagtc gcctcatggc ct -             #tcattaat   2220                                                                  - - gatagactta atgttgtaca tgccatggtg ctggcccagc aataccaagc ac -             #tcaaagct   2280                                                                  - - gaggaagaag ctcaggattg agcttccggg acaaaagcag gggggaatga ga -             #agtcagaa   2340                                                                  - - ccccccacct ttgctacata aataaccgct ttcatttcgc ttctgtaaaa cg -             #cttatgcg   2400                                                                  - - ccccacccta gccggaaagt ccccagccgc tacgcaaccc gggccccgag tt -             #gcatcagc   2460                                                                  - - cgttcgcaac ccgggctccg agttgcatca gccgaaagaa acttcatttc cc -             #aagctt     2518                                                                  - -  - - <210> SEQ ID NO 2                                                    <211> LENGTH: 7616                                                             <212> TYPE: DNA                                                                <213> ORGANISM: Artificial Sequence                                            <220> FEATURE:                                                                 <223> 
OTHER INFORMATION: Description of Artificial - #Sequence: Portion      of                                                                                    construct                                                                 - - <400> SEQUENCE: 2                                                          - - aatgaaagac cccacctgta ggtttggcaa gctagcttaa gtaacgccat tt -              #tgcaaggc     60                                                                  - - atggaaaaat acataactga gaatagagaa gttcagatca aggtcaggaa ca -             #gatggaac    120                                                                  - - agctgaatat gggccaaaca ggatatctgt ggtaagcagt tcctgccccg gc -             #tcagggcc    180                                                                  - - aagaacagat ggaacagctg aatatgggcc aaacaggata tctgtggtaa gc -             #agttcctg    240                                                                  - - ccccggctca gggccaagaa cagatggtcc ccagatgcgg tccagccctc ag -             #cagtttct    300                                                                  - - agagaaccat cagatgtttc cagggtgccc caaggacctg aaatgaccct gt -             #gccttatt    360                                                                  - - tgaactaacc aatcagttcg cttctcgctt ctgttcgcgc gcttctgctc cc -             #cgagctca    420                                                                  - - ataaaagagc ccacaacccc tcactcgggg cgccagtcct ccgattgact ga -             #gtcgcccg    480                                                                  - - ggtacccgtg tatccaataa accctcttgc agttgcatcc gacttgtggt ct -             #cgctgttc    540                                                                  - - cttgggaggg tctcctctga gtgattgact acccgtcagc gggggtcttt ca -             #tttggggg    600                                                                  - - ctcgtccggg atcgggagac ccctgcccag ggaccaccga cccaccaccg gg -             #aggtaagc    660                 
                                                 - - tggaagcttc tgcagcatcg ttctgtgttg tctctgtctg actgtgtttc tg -             #tatttgtc    720                                                                  - - tgagaatatg ggccagactg ttaccactcc cttaagtttg accttaggtc ac -             #tggaaaga    780                                                                  - - tgtcgagcgg atcgctcaca accagtcggt agatgtcaag aagagacgtt gg -             #gttacctt    840                                                                  - - ctgctctgca gaatggccaa cctttaacgt cggatggccg cgagacggca cc -             #tttaaccg    900                                                                  - - agacctcatc acccaggtta agatcaaggt cttttcacct ggcccgcatg ga -             #cacccaga    960                                                                  - - ccaggtcccc tacatcgtga cctgggaagc cttggctttt gacccccctc cc -             #tgggtcaa   1020                                                                  - - gccctttgta caccctaagc ctccgcctcc tcttcctcca tccgccccgt ct -             #ctccccct   1080                                                                  - - tgaacctcct cgttcgaccc cgcctcgatc ctccctttat ccagccctca ct -             #ccttctct   1140                                                                  - - aggcgccaaa cctaaacctc aagttctttc tgacagtggg gggccgctca tc -             #gacctact   1200                                                                  - - tacagaagac cccccgcctt atagggaccc aagaccaccc ccttccgaca gg -             #gacggaaa   1260                                                                  - - tggtggagaa gcgacccctg cgggagaggc accggacccc tccccaatgg ca -             #tctcgcct   1320                                                                  - - acgtgggaga cgggagcccc ctgtggccga ctccactacc tcgcaggcat tc -             #cccctccg   1380                                                                  - - cgcaggagga aacggacagc ttcaatactg gccgttctcc 
tcttctgacc tt -             #tacaactg   1440                                                                  - - gaaaaataat aacccttctt tttctgaaga tccaggtaaa ctgacagctc tg -             #atcgagtc   1500                                                                  - - tgttctcatc acccatcagc ccacctggga cgactgtcag cagctgttgg gg -             #actctgct   1560                                                                  - - gaccggagaa gaaaaacaac gggtgctctt agaggctaga aaggcggtgc gg -             #ggcgatga   1620                                                                  - - tgggcgcccc actcaactgc ccaatgaagt cgatgccgct tttcccctcg ag -             #cgcccaga   1680                                                                  - - ctgggattac accacccagg caggtaggaa ccacctagtc cactatcgcc ag -             #ttgctcct   1740                                                                  - - agcgggtctc caaaacgcgg gcagaagccc caccaatttg gccaaggtaa aa -             #ggaataac   1800                                                                  - - acaagggccc aatgagtctc cctcggcctt cctagagaga cttaaggaag cc -             #tatcgcag   1860                                                                  - - gtacactcct tatgaccctg aggacccagg gcaagaaact aatgtgtcta tg -             #tctttcat   1920                                                                  - - ttggcagtct gccccagaca ttgggagaaa gttagagagg ttagaagatt ta -             #aaaaacaa   1980                                                                  - - gacgcttgga gatttggtta gagaggcaga aaagatcttt aataaacgag aa -             #accccgga   2040                                                                  - - agaaagagag gaacgtatca ggagagaaac agaggaaaaa gaagaacgcc gt -             #aggacaga   2100                                                                  - - ggatgagcag aaagagaaag aaagagatcg taggagacat agagagatga gc -             #aagctatt   2160                                                            
      - - ggccactgtc gttagtggac agaaacagga tagacaggga ggagaacgaa gg -             #aggtccca   2220                                                                  - - actcgatcgc gaccagtgtg cctactgcaa agaaaagggg cactgggcta aa -             #gattgtcc   2280                                                                  - - caagaaacca cgaggacctc ggggaccaag accccagacc tccctcctga cc -             #ctagatga   2340                                                                  - - ctagggaggt cagggtcagg agcccccccc tgaacccagg ataaccctca aa -             #gtcggggg   2400                                                                  - - gcaacccgtc accttcctgg tagatactgg ggcccaacac tccgtgctga cc -             #caaaatcc   2460                                                                  - - tggaccccta agtgataagt ctgcctgggt ccaaggggct actggaggaa ag -             #cggtatcg   2520                                                                  - - ctggaccacg gatcgcaaag tacatctagc taccggtaag gtcacccact ct -             #ttcctcca   2580                                                                  - - tgtaccagac tgtccctatc ctctgttagg aagagatttg ctgactaaac ta -             #aaagccca   2640                                                                  - - aatccacttt gagggatcag gagctcaggt tatgggacca atggggcagc cc -             #ctgcaagt   2700                                                                  - - gttgacccta aatatagaag atgagcatcg gctacatgag acctcaaaag ag -             #ccagatgt   2760                                                                  - - ttctctaggg tccacatggc tgtctgattt tcctcaggcc tgggcggaaa cc -             #gggggcat   2820                                                                  - - gggactggca gttcgccaag ctcctctgat catacctctg aaagcaacct ct -             #acccccgt   2880                                                                  - - gtccataaaa caatacccca tgtcacaaga agccagactg gggatcaagc cc -             #cacataca   2940      
                                                            - - gagactgttg gaccagggaa tactggtacc ctgccagtcc ccctggaaca cg -             #cccctgct   3000                                                                  - - acccgttaag aaaccaggga ctaatgatta taggcctgtc caggatctga ga -             #gaagtcaa   3060                                                                  - - caagcgggtg gaagacatcc accccaccgt gcccaaccct tacaacctct tg -             #agcgggct   3120                                                                  - - cccaccgtcc caccagtggt acactgtgct tgatttaaag gatgcctttt tc -             #tgcctgag   3180                                                                  - - actccacccc accagtcagc ctctcttcgc ctttgagtgg agagatccag ag -             #atgggaat   3240                                                                  - - ctcaggacaa ttgacctgga ccagactccc acagggtttc aaaaacagtc cc -             #accctgtt   3300                                                                  - - tgatgaggca ctgcacagag acctagcaga cttccggatc cagcacccag ac -             #ttgatcct   3360                                                                  - - gctacagtac gtggatgact tactgctggc cgccacttct gagctagact gc -             #caacaagg   3420                                                                  - - tactcgggcc ctgttacaaa ccctagggaa cctcgggtat cgggcctcgg cc -             #aagaaagc   3480                                                                  - - ccaaatttgc cagaaacagg tcaagtatct ggggtatctt ctaaaagagg gt -             #cagagatg   3540                                                                  - - gctgactgag gccagaaaag agactgtgat ggggcagcct actccgaaga cc -             #cctcgaca   3600                                                                  - - actaagggag ttcctaggga cggcaggctt ctgtcgcctc tggatccctg gg -             #tttgcaga   3660                                                                  - - aatggcagcc cccttgtacc ctctcaccaa 
aacggggact ctgtttaatt gg -             #ggcccaga   3720                                                                  - - ccaacaaaag gcctatcaag aaatcaagca agctcttcta actgccccag cc -             #ctggggtt   3780                                                                  - - gccagatttg actaagccct ttgaactctt tgtcgacgag aagcagggct ac -             #gccaaagg   3840                                                                  - - tgtcctaacg caaaaactgg gaccttggcg tcggccggtg gcctacctgt cc -             #aaaaagct   3900                                                                  - - agacccagta gcagctgggt ggcccccttg cctacggatg gtagcagcca tt -             #gccgtact   3960                                                                  - - gacaaaggat gcaggcaagc taaccatggg acagccacta gtcattctgg cc -             #ccccatgc   4020                                                                  - - agtagaggca ctagtcaaac aaccccccga ccgctggctt tccaacgccc gg -             #atgactca   4080                                                                  - - ctatcaggcc ttgcttttgg acacggaccg ggtccagttc ggaccggtgg ta -             #gccctgaa   4140                                                                  - - cccggctacg ctgctcccac tgcctgagga agggctgcaa cacaactgcc tt -             #gatatcct   4200                                                                  - - ggccgaagcc cacggaaccc gacccgacct aacggaccag ccgctcccag ac -             #gccgacca   4260                                                                  - - cacctggtac acggatggaa gcagtctctt acaagaggga cagcgtaagg cg -             #ggagctgc   4320                                                                  - - ggtgaccacc gagaccgagg taatctgggc taaagccctg ccagccggga ca -             #tccgctca   4380                                                                  - - gcgggctgaa ctgatagcac tcacccaggc cctaaagatg gcagaaggta ag -             #aagctaaa   4440                                                 
                 - - tgtttatact gatagccgtt atgcttttgc tactgcccat atccatggag aa -             #atatacag   4500                                                                  - - aaggcgtggg ttgctcacat cagaaggcaa agagatcaaa aataaagacg ag -             #atcttggc   4560                                                                  - - cctactaaaa gccctctttc tgcccaaaag acttagcata atccattgtc ca -             #ggacatca   4620                                                                  - - aaagggacac agcgccgagg ctagaggcaa ccggatggct gaccaagcgg cc -             #cgaaaggc   4680                                                                  - - agccatcaca gagactccag acacctctac cctcctcata gaaaattcat ca -             #ccctacac   4740                                                                  - - ctcagaacat tttcattaca cagtgactga tataaaggac ctaaccaagt tg -             #ggggccat   4800                                                                  - - ttatgataaa acaaagaagt attgggtcta ccaaggaaaa cctgtgatgc ct -             #gaccagtt   4860                                                                  - - tacttttgaa ttattagact ttcttcatca gctgactcac ctcagcttct ca -             #aaaatgaa   4920                                                                  - - ggctctccta gagagaagcc acagtcccta ctacatgctg aaccgggatc ga -             #acactcaa   4980                                                                  - - aaatatcact gagacctgca aagcttgtgc acaagtcaac gccagcaagt ct -             #gccgttaa   5040                                                                  - - acagggaact agggtccgcg ggcatcggcc cggcactcat tgggagatcg at -             #ttcaccga   5100                                                                  - - gataaagccc ggattgtatg gctataaata tcttctagtt tttatagata cc -             #ttttctgg   5160                                                                  - - ctggatagaa gccttcccaa ccaagaaaga aaccgccaag gtcgtaacca ag -             #aagctact  
 5220                                                                  - - agaggagatc ttccccaggt tcggcatgcc tcaggtattg ggaactgaca at -             #gggcctgc   5280                                                                  - - cttcgtctcc aaggtgagtc agacagtggc cgatctgttg gggattgatt gg -             #aaattaca   5340                                                                  - - ttgtgcatac agaccccaaa gctcaggcca ggtagaaaga atgaatagaa cc -             #atcaagga   5400                                                                  - - gactttaact aaattaacgc ttgcaactgg ctctagagac tgggtgctcc ta -             #ctcccctt   5460                                                                  - - agccctgtac cgagcccgca acacgccggg cccccatggc ctcaccccat at -             #gagatctt   5520                                                                  - - atatggggca cccccgcccc ttgtaaactt ccctgaccct gacatgacaa ga -             #gttactaa   5580                                                                  - - cagcccctct ctccaagctc acttacaggc tctctactta gtccagcacg aa -             #gtctggag   5640                                                                  - - acctctggcg gcagcctacc aagaacaact ggaccgaccg gtggtacctc ac -             #ccttaccg   5700                                                                  - - agtcggcgac acagtgtggg tccgccgaca ccagactaag aacctagaac ct -             #cgctggaa   5760                                                                  - - aggaccttac acagtcctgc tgaccacccc caccgccctc aaagtagacg gc -             #atcgcagc   5820                                                                  - - ttggatacac gccgcccacg tgaaggctgc cgaccccggg ggtggaccat cc -             #tctagact   5880                                                                  - - gacatggcgc gttcaacgct ctcaaaaccc cttaaaaata aggttaaccc gc -             #gaggcccc   5940                                                                  - - ctaatcccct taattcttct 
gatgctcaga ggggtcagta ctgcttcgcc cg -             #gctccagt   6000                                                                  - - gcggcccagc cggccaccat gaaaacattt aacatttctc aacaagatct ag -             #aattagta   6060                                                                  - - gaagtagcga cagagaagat tacaatgctt tatgaggata ataaacatca tg -             #tgggagcg   6120                                                                  - - gcaattcgta cgaaaacagg agaaatcatt tcggcagtac atattgaagc gt -             #atatagga   6180                                                                  - - cgagtaactg tttgtgcaga agccattgcg attggtagtg cagtttcgaa tg -             #gacaaaag   6240                                                                  - - gattttgaca cgattgtagc tgttagacac ccttattctg acgaagtaga ta -             #gaagtatt   6300                                                                  - - cgagtggtaa gtccttgtgg tatgtgtagg gagttgattt cagactatgc ac -             #cagattgt   6360                                                                  - - tttgtgttaa tagaaatgaa tggcaagtta gtcaaaacta cgattgaaga ac -             #tcattcca   6420                                                                  - - ctcaaatata cccgaaatta aaagttttac caccaagctt atcgattagt cc -             #aatttgtt   6480                                                                  - - aaagacagga tatcagtggt ccaggctcta gttttgactc aacaatatca cc -             #agctgaag   6540                                                                  - - cctatagagt acgagccata gataaaataa aagattttat ttagtctcca ga -             #aaaagggg   6600                                                                  - - ggaatgaaag accccacctg taggtttggc aagctagctt aagtaacgcc at -             #tttgcaag   6660                                                                  - - gcatggaaaa atacataact gagaatagag aagttcagat caaggtcagg aa -             #cagatgga   6720                                      
                            - - acagtcgaga acttgtttat tgcagcttat aatggttaca aataaagcaa ta -             #gcatcaca   6780                                                                  - - aatttcacaa ataaagcatt tttttcactg cattctagtt gtggtttgtc ca -             #aactcatc   6840                                                                  - - aatgtatctt atcatgtctg gatccccagg aagctcctct gtgtcctcat aa -             #accctaac   6900                                                                  - - ctcctctact tgagaggaca ttccaatcat aggctgccca tccaccctct gt -             #gtcctcct   6960                                                                  - - gttaattagg tcacttaaca aaaaggaaat tgggtagggg tttttcacag ac -             #cgctttct   7020                                                                  - - aagggtaatt ttaaaatatc tgggaagtcc cttccactgc tgtgttccag aa -             #gtgttggt   7080                                                                  - - aaacagccca caaatgtcaa cagcagaaac atacaagctg tcagctttgc ac -             #aagggccc   7140                                                                  - - aacaccctgc tcatcaagaa gcactgtggt tgctgtgtta gtaatgtgca aa -             #acaggagg   7200                                                                  - - cacattttcc ccacctgtgt aggttccaaa atatctagtg ttttcatttt ta -             #cttggatc   7260                                                                  - - aggaacccag cactccactg gataagcatt atccttatcc aaaacagcct tg -             #tggtcagt   7320                                                                  - - gttcatctgc tgactgtcaa ctgtagcatt ttttggggtt acagtttgag ca -             #ggatattt   7380                                                                  - - ggtcctgtag tttgctaaca caccctgcag ctccaaaggt tccccaccaa ca -             #gcaaaaaa   7440                                                                  - - atgaaaattt gacccttgaa tgggttttcc agcaccattt tcatgagttt tt -             
#tgtgtccc   7500
- - tgaatgcaag tttaacatag cagttacccc aataacctca gttttaacag ta - #acagcttc   7560
- - ccacatcaaa atatttccac aggttaagtc ctcatttaaa ttaggcaaag ga - #attc   7616

<210> SEQ ID NO 3
<211> LENGTH: 7308
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<400> SEQUENCE: 3
- - agatctcccg atcccctatg gtcgactctc agtacaatct gctctgatgc cg - #catagtta     60
- - agccagtatc tgctccctgc ttgtgtgttg gaggtcgctg agtagtgcgc ga - #gcaaaatt    120
- - taagctacaa caaggcaagg cttgaccgac aattgcatga agaatctgct ta - #gggttagg    180
- - cgttttgcgc tgcttcgcga tgtacgggcc agatatacgc gttgacattg at - #tattgact    240
- - agttattaat agtaatcaat tacggggtca ttagttcata gcccatatat gg - #agttccgc    300
- - gttacataac ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc cc - #gcccattg    360
                                                         - - acgtcaataa tgacgtatgt tcccatagta acgccaatag ggactttcca tt -             #gacgtcaa    420                                                                  - - tgggtggact atttacggta aactgcccac ttggcagtac atcaagtgta tc -             #atatgcca    480                                                                  - - agtacgcccc ctattgacgt caatgacggt aaatggcccg cctggcatta tg -             #cccagtac    540                                                                  - - atgaccttat gggactttcc tacttggcag tacatctacg tattagtcat cg -             #ctattacc    600                                                                  - - atggtgatgc ggttttggca gtacatcaat gggcgtggat agcggtttga ct -             #cacgggga    660                                                                  - - tttccaagtc tccaccccat tgacgtcaat gggagtttgt tttggcacca aa -             #atcaacgg    720                                                                  - - gactttccaa aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg ta -             #ggcgtgta    780                                                                  - - cggtgggagg tctatataag cagagctctc tggctaacta gagaacccac tg -             #cttaactg    840                                                                  - - gcttatcgaa atgtcgactg agaacttcag ggtgagtttg gggacccttg at -             #tgttcttt    900                                                                  - - ctttttcgct attgtaaaat tcatgttata tggagggggc aaagttttca gg -             #gtgttgtt    960                                                                  - - tagaatggga agatgtccct tgtatcacca tggaccctca tgataatttt gt -             #ttctttca   1020                                                                  - - ctttctactc tgttgacaac cattgtctcc tcttattttc ttttcatttt ct -             #gtaacttt   1080                                                                  - - ttcgttaaac tttagcttgc atttgtaacg 
aatttttaaa ttcacttttg tt -             #tatttgtc   1140                                                                  - - agattgtaag tactttctct aatcactttt ttttcaaggc aatcagggta ta -             #ttatattg   1200                                                                  - - tacttcagca cagttttaga gaacaattgt tataattaaa tgataaggta ga -             #atatttct   1260                                                                  - - gcatataaat tctggctggc gtggaaatat tcttattggt agaaacaact ac -             #atcctggt   1320                                                                  - - catcatcctg cctttctctt tatggttaca atgatataca ctgtttgaga tg -             #aggataaa   1380                                                                  - - atactctgag tccaaaccgg gcccctctgc taaccatgtt catgccttct tc -             #tttttcct   1440                                                                  - - acagctcctg ggcaacgtgc tggttgttgt gctgtctcat cattttggca ag -             #aattggcc   1500                                                                  - - gcaagcttct gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gt -             #atttgtct   1560                                                                  - - gagaatatgg gccagactgt taccactccc ttaagtttga ccttaggtca ct -             #ggaaagat   1620                                                                  - - gtcgagcgga tcgctcacaa ccagtcggta gatgtcaaga agagacgttg gg -             #ttaccttc   1680                                                                  - - tgctctgcag aatggccaac ctttaacgtc ggatggccgc gagacggcac ct -             #ttaaccga   1740                                                                  - - gacctcatca cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg ac -             #acccagac   1800                                                                  - - caggtcccct acatcgtgac ctgggaagcc ttggcttttg acccccctcc ct -             #gggtcaag   1860                                                 
                 - - ccctttgtac accctaagcc tccgcctcct cttcctccat ccgccccgtc tc -             #tccccctt   1920                                                                  - - gaacctcctc gttcgacccc gcctcgatcc tccctttatc cagccctcac tc -             #cttctcta   1980                                                                  - - ggcgccaaac ctaaacctca agttctttct gacagtgggg ggccgctcat cg -             #acctactt   2040                                                                  - - acagaagacc ccccgcctta tagggaccca agaccacccc cttccgacag gg -             #acggaaat   2100                                                                  - - ggtggagaag cgacccctgc gggagaggca ccggacccct ccccaatggc at -             #ctcgccta   2160                                                                  - - cgtgggagac gggagccccc tgtggccgac tccactacct cgcaggcatt cc -             #ccctccgc   2220                                                                  - - gcaggaggaa acggacagct tcaatactgg ccgttctcct cttctgacct tt -             #acaactgg   2280                                                                  - - aaaaataata acccttcttt ttctgaagat ccaggtaaac tgacagctct ga -             #tcgagtct   2340                                                                  - - gttctcatca cccatcagcc cacctgggac gactgtcagc agctgttggg ga -             #ctctgctg   2400                                                                  - - accggagaag aaaaacaacg ggtgctctta gaggctagaa aggcggtgcg gg -             #gcgatgat   2460                                                                  - - gggcgcccca ctcaactgcc caatgaagtc gatgccgctt ttcccctcga gc -             #gcccagac   2520                                                                  - - tgggattaca ccacccaggc aggtaggaac cacctagtcc actatcgcca gt -             #tgctccta   2580                                                                  - - gcgggtctcc aaaacgcggg cagaagcccc accaatttgg ccaaggtaaa ag -             #gaataaca  
 2640                                                                  - - caagggccca atgagtctcc ctcggccttc ctagagagac ttaaggaagc ct -             #atcgcagg   2700                                                                  - - tacactcctt atgaccctga ggacccaggg caagaaacta atgtgtctat gt -             #ctttcatt   2760                                                                  - - tggcagtctg ccccagacat tgggagaaag ttagagaggt tagaagattt aa -             #aaaacaag   2820                                                                  - - acgcttggag atttggttag agaggcagaa aagatcttta ataaacgaga aa -             #ccccggaa   2880                                                                  - - gaaagagagg aacgtatcag gagagaaaca gaggaaaaag aagaacgccg ta -             #ggacagag   2940                                                                  - - gatgagcaga aagagaaaga aagagatcgt aggagacata gagagatgag ca -             #agctattg   3000                                                                  - - gccactgtcg ttagtggaca gaaacaggat agacagggag gagaacgaag ga -             #ggtcccaa   3060                                                                  - - ctcgatcgcg accagtgtgc ctactgcaaa gaaaaggggc actgggctaa ag -             #attgtccc   3120                                                                  - - aagaaaccac gaggacctcg gggaccaaga ccccagacct ccctcctgac cc -             #tagatgac   3180                                                                  - - tagggaggtc agggtcagga gcccccccct gaacccagga taaccctcaa ag -             #tcgggggg   3240                                                                  - - caacccgtca ccttcctggt agatactggg gcccaacact ccgtgctgac cc -             #aaaatcct   3300                                                                  - - ggacccctaa gtgataagtc tgcctgggtc caaggggcta ctggaggaaa gc -             #ggtatcgc   3360                                                                  - - tggaccacgg atcgcaaagt 
acatctagct accggtaagg tcacccactc tt -             #tcctccat   3420                                                                  - - gtaccagact gtccctatcc tctgttagga agagatttgc tgactaaact aa -             #aagcccaa   3480                                                                  - - atccactttg agggatcagg agctcaggtt atgggaccaa tggggcagcc cc -             #tgcaagtg   3540                                                                  - - ttgaccctaa atatagaaga tgagcatcgg ctacatgaga cctcaaaaga gc -             #cagatgtt   3600                                                                  - - tctctagggt ccacatggct gtctgatttt cctcaggcct gggcggaaac cg -             #ggggcatg   3660                                                                  - - ggactggcag ttcgccaagc tcctctgatc atacctctga aagcaacctc ta -             #cccccgtg   3720                                                                  - - tccataaaac aataccccat gtcacaagaa gccagactgg ggatcaagcc cc -             #acatacag   3780                                                                  - - agactgttgg accagggaat actggtaccc tgccagtccc cctggaacac gc -             #ccctgcta   3840                                                                  - - cccgttaaga aaccagggac taatgattat aggcctgtcc aggatctgag ag -             #aagtcaac   3900                                                                  - - aagcgggtgg aagacatcca ccccaccgtg cccaaccctt acaacctctt ga -             #gcgggctc   3960                                                                  - - ccaccgtccc accagtggta cactgtgctt gatttaaagg atgccttttt ct -             #gcctgaga   4020                                                                  - - ctccacccca ccagtcagcc tctcttcgcc tttgagtgga gagatccaga ga -             #tgggaatc   4080                                                                  - - tcaggacaat tgacctggac cagactccca cagggtttca aaaacagtcc ca -             #ccctgttt   4140                                      
                            - - gatgaggcac tgcacagaga cctagcagac ttccggatcc agcacccaga ct -             #tgatcctg   4200                                                                  - - ctacagtacg tggatgactt actgctggcc gccacttctg agctagactg cc -             #aacaaggt   4260                                                                  - - actcgggccc tgttacaaac cctagggaac ctcgggtatc gggcctcggc ca -             #agaaagcc   4320                                                                  - - caaatttgcc agaaacaggt caagtatctg gggtatcttc taaaagaggg tc -             #agagatgg   4380                                                                  - - ctgactgagg ccagaaaaga gactgtgatg gggcagccta ctccgaagac cc -             #ctcgacaa   4440                                                                  - - ctaagggagt tcctagggac ggcaggcttc tgtcgcctct ggatccctgg gt -             #ttgcagaa   4500                                                                  - - atggcagccc ccttgtaccc tctcaccaaa acggggactc tgtttaattg gg -             #gcccagac   4560                                                                  - - caacaaaagg cctatcaaga aatcaagcaa gctcttctaa ctgccccagc cc -             #tggggttg   4620                                                                  - - ccagatttga ctaagccctt tgaactcttt gtcgacgaga agcagggcta cg -             #ccaaaggt   4680                                                                  - - gtcctaacgc aaaaactggg accttggcgt cggccggtgg cctacctgtc ca -             #aaaagcta   4740                                                                  - - gacccagtag cagctgggtg gcccccttgc ctacggatgg tagcagccat tg -             #ccgtactg   4800                                                                  - - acaaaggatg caggcaagct aaccatggga cagccactag tcattctggc cc -             #cccatgca   4860                                                                  - - gtagaggcac tagtcaaaca accccccgac cgctggcttt ccaacgcccg ga -             
#tgactcac   4920                                                                  - - tatcaggcct tgcttttgga cacggaccgg gtccagttcg gaccggtggt ag -             #ccctgaac   4980                                                                  - - ccggctacgc tgctcccact gcctgaggaa gggctgcaac acaactgcct tg -             #atatcctg   5040                                                                  - - gccgaagccc acggaacccg acccgaccta acggaccagc cgctcccaga cg -             #ccgaccac   5100                                                                  - - acctggtaca cggatggaag cagtctctta caagagggac agcgtaaggc gg -             #gagctgcg   5160                                                                  - - gtgaccaccg agaccgaggt aatctgggct aaagccctgc cagccgggac at -             #ccgctcag   5220                                                                  - - cgggctgaac tgatagcact cacccaggcc ctaaagatgg cagaaggtaa ga -             #agctaaat   5280                                                                  - - gtttatactg atagccgtta tgcttttgct actgcccata tccatggaga aa -             #tatacaga   5340                                                                  - - aggcgtgggt tgctcacatc agaaggcaaa gagatcaaaa ataaagacga ga -             #tcttggcc   5400                                                                  - - ctactaaaag ccctctttct gcccaaaaga cttagcataa tccattgtcc ag -             #gacatcaa   5460                                                                  - - aagggacaca gcgccgaggc tagaggcaac cggatggctg accaagcggc cc -             #gaaaggca   5520                                                                  - - gccatcacag agactccaga cacctctacc ctcctcatag aaaattcatc ac -             #cctacacc   5580                                                                  - - tcagaacatt ttcattacac agtgactgat ataaaggacc taaccaagtt gg -             #gggccatt   5640                                                                  - - tatgataaaa 
caaagaagta ttgggtctac caaggaaaac ctgtgatgcc tg -             #accagttt   5700                                                                  - - acttttgaat tattagactt tcttcatcag ctgactcacc tcagcttctc aa -             #aaatgaag   5760                                                                  - - gctctcctag agagaagcca cagtccctac tacatgctga accgggatcg aa -             #cactcaaa   5820                                                                  - - aatatcactg agacctgcaa agcttgtgca caagtcaacg ccagcaagtc tg -             #ccgttaaa   5880                                                                  - - cagggaacta gggtccgcgg gcatcggccc ggcactcatt gggagatcga tt -             #tcaccgag   5940                                                                  - - ataaagcccg gattgtatgg ctataaatat cttctagttt ttatagatac ct -             #tttctggc   6000                                                                  - - tggatagaag ccttcccaac caagaaagaa accgccaagg tcgtaaccaa ga -             #agctacta   6060                                                                  - - gaggagatct tccccaggtt cggcatgcct caggtattgg gaactgacaa tg -             #ggcctgcc   6120                                                                  - - ttcgtctcca aggtgagtca gacagtggcc gatctgttgg ggattgattg ga -             #aattacat   6180                                                                  - - tgtgcataca gaccccaaag ctcaggccag gtagaaagaa tgaatagaac ca -             #tcaaggag   6240                                                                  - - actttaacta aattaacgct tgcaactggc tctagagact gggtgctcct ac -             #tcccctta   6300                                                                  - - gccctgtacc gagcccgcaa cacgccgggc ccccatggcc tcaccccata tg -             #agatctta   6360                                                                  - - tatggggcac ccccgcccct tgtaaacttc cctgaccctg acatgacaag ag -             #ttactaac   6420                           
                                       - - agcccctctc tccaagctca cttacaggct ctctacttag tccagcacga ag -             #tctggaga   6480                                                                  - - cctctggcgg cagcctacca agaacaactg gaccgaccgg tggtacctca cc -             #cttaccga   6540                                                                  - - gtcggcgaca cagtgtgggt ccgccgacac cagactaaga acctagaacc tc -             #gctggaaa   6600                                                                  - - ggaccttaca cagtcctgct gaccaccccc accgccctca aagtagacgg ca -             #tcgcagct   6660                                                                  - - tggatacacg ccgcccacgt gaaggctgcc gaccccgggg gtggaccatc ct -             #ctagactg   6720                                                                  - - acatggcgcg ttcaacgctc tcaaaacccc ttaaaaataa ggttaacccg cg -             #aggccccc   6780                                                                  - - taatcccctt aattcttctg atgctcagag gggtcagtac tgcttcgccc gg -             #ctccagtg   6840                                                                  - - cggcccagcc ggccaccatg aaaacattta acatttctca acaagatcta ga -             #attagtag   6900                                                                  - - aagtagcgac agagaagatt acaatgcttt atgaggataa taaacatcat gt -             #gggagcgg   6960                                                                  - - caattcgtac gaaaacagga gaaatcattt cggcagtaca tattgaagcg ta -             #tataggac   7020                                                                  - - gagtaactgt ttgtgcagaa gccattgcga ttggtagtgc agtttcgaat gg -             #acaaaagg   7080                                                                  - - attttgacac gattgtagct gttagacacc cttattctga cgaagtagat ag -             #aagtattc   7140                                                                  - - gagtggtaag tccttgtggt atgtgtaggg agttgatttc agactatgca cc -  
#agattgtt   7200
- - ttgtgttaat agaaatgaat ggcaagttag tcaaaactac gattgaagaa ct - #cattccac   7260
- - tcaaatatac ccgaaattaa aagttttacc accaagctta tcgaattc - #   7308

<210> SEQ ID NO 4
<211> LENGTH: 7308
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<400> SEQUENCE: 4
- - agatctcccg atcccctatg gtcgactctc agtacaatct gctctgatgc cg - #catagtta     60
- - agccagtatc tgctccctgc ttgtgtgttg gaggtcgctg agtagtgcgc ga - #gcaaaatt    120
- - taagctacaa caaggcaagg cttgaccgac aattgcatga agaatctgct ta - #gggttagg    180
- - cgttttgcgc tgcttcgcga tgtacgggcc agatatacgc gttgacattg at - #tattgact    240
- - agttattaat agtaatcaat tacggggtca ttagttcata gcccatatat gg - #agttccgc    300
- - gttacataac ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc cc - #gcccattg
360                                                                  - - acgtcaataa tgacgtatgt tcccatagta acgccaatag ggactttcca tt -             #gacgtcaa    420                                                                  - - tgggtggact atttacggta aactgcccac ttggcagtac atcaagtgta tc -             #atatgcca    480                                                                  - - agtacgcccc ctattgacgt caatgacggt aaatggcccg cctggcatta tg -             #cccagtac    540                                                                  - - atgaccttat gggactttcc tacttggcag tacatctacg tattagtcat cg -             #ctattacc    600                                                                  - - atggtgatgc ggttttggca gtacatcaat gggcgtggat agcggtttga ct -             #cacgggga    660                                                                  - - tttccaagtc tccaccccat tgacgtcaat gggagtttgt tttggcacca aa -             #atcaacgg    720                                                                  - - gactttccaa aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg ta -             #ggcgtgta    780                                                                  - - cggtgggagg tctatataag cagagctctc tggctaacta gagaacccac tg -             #cttaactg    840                                                                  - - gcttatcgaa atgtcgactg agaacttcag ggtgagtttg gggacccttg at -             #tgttcttt    900                                                                  - - ctttttcgct attgtaaaat tcatgttata tggagggggc aaagttttca gg -             #gtgttgtt    960                                                                  - - tagaatggga agatgtccct tgtatcacca tggaccctca tgataatttt gt -             #ttctttca   1020                                                                  - - ctttctactc tgttgacaac cattgtctcc tcttattttc ttttcatttt ct -             #gtaacttt   1080                                                                  - - ttcgttaaac tttagcttgc 
atttgtaacg aatttttaaa ttcacttttg tt -             #tatttgtc   1140                                                                  - - agattgtaag tactttctct aatcactttt ttttcaaggc aatcagggta ta -             #ttatattg   1200                                                                  - - tacttcagca cagttttaga gaacaattgt tataattaaa tgataaggta ga -             #atatttct   1260                                                                  - - gcatataaat tctggctggc gtggaaatat tcttattggt agaaacaact ac -             #atcctggt   1320                                                                  - - catcatcctg cctttctctt tatggttaca atgatataca ctgtttgaga tg -             #aggataaa   1380                                                                  - - atactctgag tccaaaccgg gcccctctgc taaccatgtt catgccttct tc -             #tttttcct   1440                                                                  - - acagctcctg ggcaacgtgc tggttgttgt gctgtctcat cattttggca ag -             #aattggcc   1500                                                                  - - gcaagcttct gcagcatcgt tctgtgttgt ctctgtctga ctgtgtttct gt -             #atttgtct   1560                                                                  - - gagaatatgg gccagactgt taccactccc ttaagtttga ccttaggtca ct -             #ggaaagat   1620                                                                  - - gtcgagcgga tcgctcacaa ccagtcggta gatgtcaaga agagacgttg gg -             #ttaccttc   1680                                                                  - - tgctctgcag aatggccaac ctttaacgtc ggatggccgc gagacggcac ct -             #ttaaccga   1740                                                                  - - gacctcatca cccaggttaa gatcaaggtc ttttcacctg gcccgcatgg ac -             #acccagac   1800                                                                  - - caggtcccct acatcgtgac ctgggaagcc ttggcttttg acccccctcc ct -             #gggtcaag   1860                                      
                            - - ccctttgtac accctaagcc tccgcctcct cttcctccat ccgccccgtc tc -             #tccccctt   1920                                                                  - - gaacctcctc gttcgacccc gcctcgatcc tccctttatc cagccctcac tc -             #cttctcta   1980                                                                  - - ggcgccaaac ctaaacctca agttctttct gacagtgggg ggccgctcat cg -             #acctactt   2040                                                                  - - acagaagacc ccccgcctta tagggaccca agaccacccc cttccgacag gg -             #acggaaat   2100                                                                  - - ggtggagaag cgacccctgc gggagaggca ccggacccct ccccaatggc at -             #ctcgccta   2160                                                                  - - cgtgggagac gggagccccc tgtggccgac tccactacct cgcaggcatt cc -             #ccctccgc   2220                                                                  - - gcaggaggaa acggacagct tcaatactgg ccgttctcct cttctgacct tt -             #acaactgg   2280                                                                  - - aaaaataata acccttcttt ttctgaagat ccaggtaaac tgacagctct ga -             #tcgagtct   2340                                                                  - - gttctcatca cccatcagcc cacctgggac gactgtcagc agctgttggg ga -             #ctctgctg   2400                                                                  - - accggagaag aaaaacaacg ggtgctctta gaggctagaa aggcggtgcg gg -             #gcgatgat   2460                                                                  - - gggcgcccca ctcaactgcc caatgaagtc gatgccgctt ttcccctcga gc -             #gcccagac   2520                                                                  - - tgggattaca ccacccaggc aggacgcaac cacctagtcc actatcgcca gt -             #tgctccta   2580                                                                  - - gcgggtctcc aaaacgcggg cagaagcccc accaatttgg ccaaggtaaa ag -             
#gaataaca   2640                                                                  - - caagggccca atgagtctcc ctcggccttc ctagagagac ttaaggaagc ct -             #atcgcagg   2700                                                                  - - tacactcctt atgaccctga ggacccaggg caagaaacta atgtgtctat gt -             #ctttcatt   2760                                                                  - - tggcagtctg ccccagacat tgggagaaag ttagagaggt tagaagattt aa -             #aaaacaag   2820                                                                  - - acgcttggag atttggttag agaggcagaa aagatcttta ataaacgaga aa -             #ccccggaa   2880                                                                  - - gaaagagagg aacgtatcag gagagaaaca gaggaaaaag aagaacgccg ta -             #ggacagag   2940                                                                  - - gatgagcaga aagagaaaga aagagatcgt aggagacata gagagatgag ca -             #agctattg   3000                                                                  - - gccactgtcg ttagtggaca gaaacaggat agacagggag gagaacgaag ga -             #ggtcccaa   3060                                                                  - - ctcgatcgcg accagtgtgc ctactgcaaa gaaaaggggc actgggctaa ag -             #attgtccc   3120                                                                  - - aagaaaccac gaggacctcg gggaccaaga ccccagacct ccctcctgac cc -             #tagatgac   3180                                                                  - - tagggaggtc agggtcagga gcccccccct gaacccagga taaccctcaa ag -             #tcgggggg   3240                                                                  - - caacccgtca ccttcctggt agatactggg gcccaacact ccgtgctgac cc -             #aaaatcct   3300                                                                  - - ggacccctaa gtgataagtc tgcctgggtc caaggggcta ctggaggaaa gc -             #ggtatcgc   3360                                                                  - - tggaccacgg 
atcgcaaagt acatctagct accggtaagg tcacccactc tt -             #tcctccat   3420                                                                  - - gtaccagact gtccctatcc tctgttagga agagatttgc tgactaaact aa -             #aagcccaa   3480                                                                  - - atccactttg agggatcagg agctcaggtt atgggaccaa tggggcagcc cc -             #tgcaagtg   3540                                                                  - - ttgaccctaa atatagaaga tgagcatcgg ctacatgaga cctcaaaaga gc -             #cagatgtt   3600                                                                  - - tctctagggt ccacatggct gtctgatttt cctcaggcct gggcggaaac cg -             #ggggcatg   3660                                                                  - - ggactggcag ttcgccaagc tcctctgatc atacctctga aagcaacctc ta -             #cccccgtg   3720                                                                  - - tccataaaac aataccccat gtcacaagaa gccagactgg ggatcaagcc cc -             #acatacag   3780                                                                  - - agactgttgg accagggaat actggtaccc tgccagtccc cctggaacac gc -             #ccctgcta   3840                                                                  - - cccgttaaga aaccagggac taatgattat aggcctgtcc aggatctgag ag -             #aagtcaac   3900                                                                  - - aagcgggtgg aagacatcca ccccaccgtg cccaaccctt acaacctctt ga -             #gcgggctc   3960                                                                  - - ccaccgtccc accagtggta cactgtgctt gatttaaagg atgccttttt ct -             #gcctgaga   4020                                                                  - - ctccacccca ccagtcagcc tctcttcgcc tttgagtgga gagatccaga ga -             #tgggaatc   4080                                                                  - - tcaggacaat tgacctggac cagactccca cagggtttca aaaacagtcc ca -             #ccctgttt   4140                           
                                       - - gatgaggcac tgcacagaga cctagcagac ttccggatcc agcacccaga ct -             #tgatcctg   4200                                                                  - - ctacagtacg tggatgactt actgctggcc gccacttctg agctagactg cc -             #aacaaggt   4260                                                                  - - actcgggccc tgttacaaac cctagggaac ctcgggtatc gggcctcggc ca -             #agaaagcc   4320                                                                  - - caaatttgcc agaaacaggt caagtatctg gggtatcttc taaaagaggg tc -             #agagatgg   4380                                                                  - - ctgactgagg ccagaaaaga gactgtgatg gggcagccta ctccgaagac cc -             #ctcgacaa   4440                                                                  - - ctaagggagt tcctagggac ggcaggcttc tgtcgcctct ggatccctgg gt -             #ttgcagaa   4500                                                                  - - atggcagccc ccttgtaccc tctcaccaaa acggggactc tgtttaattg gg -             #gcccagac   4560                                                                  - - caacaaaagg cctatcaaga aatcaagcaa gctcttctaa ctgccccagc cc -             #tggggttg   4620                                                                  - - ccagatttga ctaagccctt tgaactcttt gtcgacgaga agcagggcta cg -             #ccaaaggt   4680                                                                  - - gtcctaacgc aaaaactggg accttggcgt cggccggtgg cctacctgtc ca -             #aaaagcta   4740                                                                  - - gacccagtag cagctgggtg gcccccttgc ctacggatgg tagcagccat tg -             #ccgtactg   4800                                                                  - - acaaaggatg caggcaagct aaccatggga cagccactag tcattctggc cc -             #cccatgca   4860                                                                  - - gtagaggcac tagtcaaaca accccccgac cgctggcttt ccaacgcccg ga -  
           #tgactcac   4920                                                                  - - tatcaggcct tgcttttgga cacggaccgg gtccagttcg gaccggtggt ag -             #ccctgaac   4980                                                                  - - ccggctacgc tgctcccact gcctgaggaa gggctgcaac acaactgcct tg -             #atatcctg   5040                                                                  - - gccgaagccc acggaacccg acccgaccta acggaccagc cgctcccaga cg -             #ccgaccac   5100                                                                  - - acctggtaca cggatggaag cagtctctta caagagggac agcgtaaggc gg -             #gagctgcg   5160                                                                  - - gtgaccaccg agaccgaggt aatctgggct aaagccctgc cagccgggac at -             #ccgctcag   5220                                                                  - - cgggctgaac tgatagcact cacccaggcc ctaaagatgg cagaaggtaa ga -             #agctaaat   5280                                                                  - - gtttatactg atagccgtta tgcttttgct actgcccata tccatggaga aa -             #tatacaga   5340                                                                  - - aggcgtgggt tgctcacatc agaaggcaaa gagatcaaaa ataaagacga ga -             #tcttggcc   5400                                                                  - - ctactaaaag ccctctttct gcccaaaaga cttagcataa tccattgtcc ag -             #gacatcaa   5460                                                                  - - aagggacaca gcgccgaggc tagaggcaac cggatggctg accaagcggc cc -             #gaaaggca   5520                                                                  - - gccatcacag agactccaga cacctctacc ctcctcatag aaaattcatc ac -             #cctacacc   5580                                                                  - - tcagaacatt ttcattacac agtgactgat ataaaggacc taaccaagtt gg -             #gggccatt   5640                                                                  - - 
tatgataaaa caaagaagta ttgggtctac caaggaaaac ctgtgatgcc tg -             #accagttt   5700                                                                  - - acttttgaat tattagactt tcttcatcag ctgactcacc tcagcttctc aa -             #aaatgaag   5760                                                                  - - gctctcctag agagaagcca cagtccctac tacatgctga accgggatcg aa -             #cactcaaa   5820                                                                  - - aatatcactg agacctgcaa agcttgtgca caagtcaacg ccagcaagtc tg -             #ccgttaaa   5880                                                                  - - cagggaacta gggtccgcgg gcatcggccc ggcactcatt gggagatcga tt -             #tcaccgag   5940                                                                  - - ataaagcccg gattgtatgg ctataaatat cttctagttt ttatagatac ct -             #tttctggc   6000                                                                  - - tggatagaag ccttcccaac caagaaagaa accgccaagg tcgtaaccaa ga -             #agctacta   6060                                                                  - - gaggagatct tccccaggtt cggcatgcct caggtattgg gaactgacaa tg -             #ggcctgcc   6120                                                                  - - ttcgtctcca aggtgagtca gacagtggcc gatctgttgg ggattgattg ga -             #aattacat   6180                                                                  - - tgtgcataca gaccccaaag ctcaggccag gtagaaagaa tgaatagaac ca -             #tcaaggag   6240                                                                  - - actttaacta aattaacgct tgcaactggc tctagagact gggtgctcct ac -             #tcccctta   6300                                                                  - - gccctgtacc gagcccgcaa cacgccgggc ccccatggcc tcaccccata tg -             #agatctta   6360                                                                  - - tatggggcac ccccgcccct tgtaaacttc cctgaccctg acatgacaag ag -             #ttactaac   6420                
                                                  - - agcccctctc tccaagctca cttacaggct ctctacttag tccagcacga ag -             #tctggaga   6480                                                                  - - cctctggcgg cagcctacca agaacaactg gaccgaccgg tggtacctca cc -             #cttaccga   6540                                                                  - - gtcggcgaca cagtgtgggt ccgccgacac cagactaaga acctagaacc tc -             #gctggaaa   6600                                                                  - - ggaccttaca cagtcctgct gaccaccccc accgccctca aagtagacgg ca -             #tcgcagct   6660                                                                  - - tggatacacg ccgcccacgt gaaggctgcc gaccccgggg gtggaccatc ct -             #ctagactg   6720                                                                  - - acatggcgcg ttcaacgctc tcaaaacccc ttaaaaataa ggttaacccg cg -             #aggccccc   6780                                                                  - - taatcccctt aattcttctg atgctcagag gggtcagtac tgcttcgccc gg -             #ctccagtg   6840                                                                  - - cggcccagcc ggccaccatg aaaacattta acatttctca acaagatcta ga -             #attagtag   6900                                                                  - - aagtagcgac agagaagatt acaatgcttt atgaggataa taaacatcat gt -             #gggagcgg   6960                                                                  - - caattcgtac gaaaacagga gaaatcattt cggcagtaca tattgaagcg ta -             #tataggac   7020                                                                  - - gagtaactgt ttgtgcagaa gccattgcga ttggtagtgc agtttcgaat gg -             #acaaaagg   7080                                                                  - - attttgacac gattgtagct gttagacacc cttattctga cgaagtagat ag -             #aagtattc   7140                                                                  - - gagtggtaag tccttgtggt atgtgtaggg agttgatttc 
agactatgca cc -             #agattgtt   7200                                                                  - - ttgtgttaat agaaatgaat ggcaagttag tcaaaactac gattgaagaa ct -             #cattccac   7260                                                                  - - tcaaatatac ccgaaattaa aagttttacc accaagctta tcgaattc  - #                   7308
<210> SEQ ID NO 5
<211> LENGTH: 6028
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3774)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3775)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3776)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3777)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3962)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3963)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3964)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3965)
<223> OTHER INFORMATION: n is any nucleotide
<400> SEQUENCE: 5
- - catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aataccgcat ca -             #ggcgccat     60                                                                  - - tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc tt -             #cgctatta    120                                                                  - - cgccagctgg 
cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gc -             #cagggttt    180                                                                  - - tcccagtcac gacgttgtaa aacgacggcc agtgaattcc gattagttca at -             #ttgttaaa    240                                                                  - - gacaggatct cagtagtcca ggctttagtc ctgactcaac aataccacca gc -             #taaaacca    300                                                                  - - ctagaatacg agccacaata aataaaagat tttatttagt ttccagaaaa ag -             #gggggaat    360                                                                  - - gaaagacccc accaaattgc ttagcctgat agccgcagta acgccatttt gc -             #aaggcatg    420                                                                  - - gaaaaatacc aaaccaagaa tagagaagtt cagatcaagg gcgggtacac ga -             #aaacagct    480                                                                  - - aacgttgggc caaacaggat atctgcggtg agcagtttcg gccccggccc gg -             #ggccaaga    540                                                                  - - acagatggtc accgcggttc ggccccggcc cggggccaag aacagatggt cc -             #ccagatat    600                                                                  - - ggcccaaccc tcagcagttt cttaagaccc atcagatgtt tccaggctcc cc -             #caaggacc    660                                                                  - - tgaaatgacc ctgtgcctta tttgaattaa ccaatcagcc tgcttctcgc tt -             #ctgttcgc    720                                                                  - - gcgcttctgc ttcccgagct ctataaaaga gctcacaacc cctcactcgg cg -             #cgccagtc    780                                                                  - - ctccgataga ctgagtcgcc cgggtacccg tgtatccaat aaatcctctt gc -             #tgttgcat    840                                                                  - - ccgactcgtg gtctcgctgt tccttgggag ggtctcctca gagtgattga ct -             #acccgtct    900                           
                                       - - cgggggtctt tcatttgggg gctcgtccgg gatctggaga cccctgccca gg -             #gaccaccg    960                                                                  - - acccaccacc gggaggtaag ctggccaaga tcttatatgg ggcacccccg cc -             #ccttgtaa   1020                                                                  - - acttccctga ccctgacatg accagagtta ctaacagccc ctctctccaa gc -             #tcacttac   1080                                                                  - - aggctctcta cttagtccag cacgaagttt ggagaccact ggcggcagct ta -             #ccaagaac   1140                                                                  - - aactggaccg gccggtggtg cctcaccctt accgggtcgg cgacacagtg tg -             #ggtccgcc   1200                                                                  - - gacatcaaac caagaaccta gaacctcgct ggaaaggacc ttacacagtc ct -             #gctgacca   1260                                                                  - - cccccaccgc cctcaaagta gacggtatcg cagcttggat acacgcagcc ca -             #cgtaaagg   1320                                                                  - - cggccgacac cgagagtgga ccatcctctg gacggacatg gcgcgttcaa cg -             #ctctcaaa   1380                                                                  - - accccctcaa gataagatta acccgtggaa gcccttaata gtcatgggag tc -             #ctgttagg   1440                                                                  - - agtagggatg gcagagagcc cccatcaggt ctttaatgta acctggagag tc -             #accaacct   1500                                                                  - - gatgactggg cgtaccgcca atgccacctc cctcctggga actgtacaag at -             #gccttccc   1560                                                                  - - aaaattatat tttgatctat gtgatctggt cggagaggag tgggaccctt ca -             #gaccagga   1620                                                                  - - accgtatgtc gggtatggct gcaagtaccc cgcagggaga cagcggaccc gg -  
           #acttttga   1680                                                                  - - cttttacgtg tgccctgggc ataccgtaaa gtcggggtgt gggggaccag ga -             #gagggcta   1740                                                                  - - ctgtggtaaa tgggggtgtg aaaccaccgg acaggcttac tggaagccca ca -             #tcatcgtg   1800                                                                  - - ggacctaatc tcccttaagc gcggtaacac cccctgggac acgggatgct ct -             #aaagttgc   1860                                                                  - - ctgtggcccc tgctacgacc tctccaaagt atccaattcc ttccaagggg ct -             #actcgagg   1920                                                                  - - gggcagatgc aaccctctag tcctagaatt cactgatgca ggaaaaaagg ct -             #aactggga   1980                                                                  - - cgggcccaaa tcgtggggac tgagactgta ccggacagga acagatccta tt -             #accatgtt   2040                                                                  - - ctccctgacc cggcaggtcc ttaatgtggg accccgagtc cccatagggc cc -             #aacccagt   2100                                                                  - - attacccgac caaagactcc cttcctcacc aatagagatt gtaccggctc ca -             #cagccacc   2160                                                                  - - tagccccctc aataccagtt accccccttc cactaccagt acaccctcaa cc -             #tcccctac   2220                                                                  - - aagtccaagt gtcccacagc cacccccagg aactggagat agactactag ct -             #ctagtcaa   2280                                                                  - - aggagcctat caggcgctta acctcaccaa tcccgacaag acccaagaat gt -             #tggctgtg   2340                                                                  - - cttagtgtcg ggacctcctt attacgaagg agtagcggtc gtgggcactt at -             #accaatca   2400                                                                  - - 
ttccaccgct ccggccaact gtacggccac ttcccaacat aagcttaccc ta -             #tctgaagt   2460                                                                  - - gacaggacag ggcctatgca tgggggcagt acctaaaact caccaggcct ta -             #tgtaacac   2520                                                                  - - cacccaaagc gccggctcag gatcctacta ccttgcagca cccgccggaa ca -             #atgtgggc   2580                                                                  - - ttgcagcact ggattgactc cctgcttgtc caccacggtg ctcaatctaa cc -             #acagatta   2640                                                                  - - ttgtgtatta gttgaactct ggcccagagt aatttaccac tcccccgatt at -             #atgtatgg   2700                                                                  - - tcagcttgaa cagcgtacca aatataaaag agagccagta tcattgaccc tg -             #gcccttct   2760                                                                  - - actaggagga ttaaccatgg gagggattgc agctggaata gggacgggga cc -             #actgcctt   2820                                                                  - - aattaaaacc cagcagtttg agcagcttca tgccgctatc cagacagacc tc -             #aacgaagt   2880                                                                  - - cgaaaagtca attaccaacc tagaaaagtc actgacctcg ttgtctgaag ta -             #gtcctaca   2940                                                                  - - gaaccgcaga ggcctagatt tgctattcct aaaggaggga ggtctctgcg ca -             #gccctaaa   3000                                                                  - - agaagaatgt tgtttttatg cagaccacac ggggctagtg agagacagca tg -             #gccaaatt   3060                                                                  - - aagagaaagg cttaatcaga gacaaaaact atttgagaca ggccaaggat gg -             #ttcgaagg   3120                                                                  - - gctgtttaat agatccccct ggtttaccac cttaatctcc accatcatgg ga -             #cctctaat   3180                
                                                  - - agtactctta ctgatcttac tctttggacc ttgcattctc aatcgattag tt -             #caatttgt   3240                                                                  - - taaagacagg atctcagtag tccaggcttt agtcctgact caacaatacc ac -             #cagctaaa   3300                                                                  - - gcctatagag tacgagccat agggcgccta gtgttgacaa ttaatcatcg gc -             #atagtata   3360                                                                  - - cggcatagta taatacgact cactatagga gggccaccat ggccaagttg ac -             #cagtgccg   3420                                                                  - - ttccggtgct caccgcgcgc gacgtcgccg gagcggtcga gttctggacc ga -             #ccggctcg   3480                                                                  - - ggttctcccg ggacttcgtg gaggacgact tcgccggtgt ggtccgggac ga -             #cgtgaccc   3540                                                                  - - tgttcatcag cgcggtccag gaccaggtgg tgccggacaa caccctggcc tg -             #ggtgtggg   3600                                                                  - - tgcgcggcct ggacgagctg tacgccgagt ggtcggaggt cgtgtccacg aa -             #cttccggg   3660                                                                  - - acgcctccgg gccggccatg accgagatcg gcgagcagcc gtgggggcgg ga -             #gttcgccc   3720                                                                  - - tgcgcgaccc ggccggcaac tgcgtgcact tcgtggccga ggagcaggac tg -             #annnncgg   3780                                                                  - - accggtcgac ttgttaactt gtttattgca gcttataatg gttacaaata aa -             #gcaatagc   3840                                                                  - - atcacaaatt tcacaaataa agcatttttt tcactgcatt ctagttgtgg tt -             #tgtccaaa   3900                                                                  - - ctcatcaatg tatcttatca tgtctggatc cagatctggg 
cccatgcggc cg -             #cggatcga   3960                                                                  - - tnnnnacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg cc -             #gcgttgct   4020                                                                  - - ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac gc -             #tcaagtca   4080                                                                  - - gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg ga -             #agctccct   4140                                                                  - - cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct tt -             #ctcccttc   4200                                                                  - - gggaagcgtg gcgctttctc aatgctcacg ctgtaggtat ctcagttcgg tg -             #taggtcgt   4260                                                                  - - tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gc -             #gccttatc   4320                                                                  - - cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac tg -             #gcagcagc   4380                                                                  - - cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt tc -             #ttgaagtg   4440                                                                  - - gtggcctaac tacggctaca ctagaaggac agtatttggt atctgcgctc tg -             #ctgaagcc   4500                                                                  - - agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca cc -             #gctggtag   4560                                                                  - - cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ct -             #caagaaga   4620                                                                  - - tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gt -             #taagggat   4680                                                            
      - - tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt aa -             #aaatgaag   4740                                                                  - - ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc aa -             #tgcttaat   4800                                                                  - - cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg cc -             #tgactccc   4860                                                                  - - cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg ct -             #gcaatgat   4920                                                                  - - accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc ca -             #gccggaag   4980                                                                  - - ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta tt -             #aattgttg   5040                                                                  - - ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg tt -             #gccattgc   5100                                                                  - - tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct cc -             #ggttccca   5160                                                                  - - acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta gc -             #tccttcgg   5220                                                                  - - tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg tt -             #atggcagc   5280                                                                  - - actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga ct -             #ggtgagta   5340                                                                  - - ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt gc -             #ccggcgtc   5400                                                                  - - aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca tt -             #ggaaaacg   5460      
                                                            - - ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt cg -             #atgtaacc   5520                                                                  - - cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt ct -             #gggtgagc   5580                                                                  - - aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga aa -             #tgttgaat   5640                                                                  - - actcatactc ttcctttttc aatattattg aagcatttat cagggttatt gt -             #ctcatgag   5700                                                                  - - cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gc -             #acatttcc   5760                                                                  - - ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa cc -             #tataaaaa   5820                                                                  - - taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt gatgacggtg aa -             #aacctctg   5880                                                                  - - acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg gg -             #agcagaca   5940                                                                  - - agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta ac -             #tatgcggc   6000                                                                  - - atcagagcag attgtactga gagtgcac         - #                  - #                6028                                                                      - -  - - <210> SEQ ID NO 6                                                    <211> LENGTH: 6061                                                             <212> TYPE: DNA                                                                <213> ORGANISM: Artificial Sequence                                            <220> FEATURE:                                
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3807)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3808)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3809)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3810)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3995)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
         <222> LOCATION: (3996)                                                         <223> OTHER INFORMATION: n is any nucleotide                                   <220> FEATURE:                                                                 <221> NAME/KEY: misc.sub.-- feature                                            <222> LOCATION: (3997)                                                         <223> OTHER INFORMATION: n is any nucleotide                                   <220> FEATURE:                                                                 <221> NAME/KEY: misc.sub.-- feature                                            <222> LOCATION: (3998)                                                         <223> OTHER INFORMATION: n is any nucleotide                                    - - <400> SEQUENCE: 6                                                          - - catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aataccgcat ca -             #ggcgccat     60                                                                  - - tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc tt -             #cgctatta    120                                                                  - - cgccagctgg cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gc -             #cagggttt    180                                                                  - - tcccagtcac gacgttgtaa aacgacggcc agtgaattcc gattagttca at -             #ttgttaaa    240                                                                  - - gacaggatct cagtagtcca ggctttagtc ctgactcaac aataccacca gc -             #taaaacca    300                                                                  - - ctagaatacg agccacaata aataaaagat tttatttagt ttccagaaaa ag -             #gggggaat    360                                                                  - - gaaagacccc accaaattgc ttagcctgat agccgcagta acgccatttt gc -             #aaggcatg    420                                                                  - - gaaaaatacc 
aaaccaagaa tagagaagtt cagatcaagg gcgggtacac ga -             #aaacagct    480                                                                  - - aacgttgggc caaacaggat atctgcggtg agcagtttcg gccccggccc gg -             #ggccaaga    540                                                                  - - acagatggtc accgcggttc ggccccggcc cggggccaag aacagatggt cc -             #ccagatat    600                                                                  - - ggcccaaccc tcagcagttt cttaagaccc atcagatgtt tccaggctcc cc -             #caaggacc    660                                                                  - - tgaaatgacc ctgtgcctta tttgaattaa ccaatcagcc tgcttctcgc tt -             #ctgttcgc    720                                                                  - - gcgcttctgc ttcccgagct ctataaaaga gctcacaacc cctcactcgg cg -             #cgccagtc    780                                                                  - - ctccgataga ctgagtcgcc cgggtacccg tgtatccaat aaatcctctt gc -             #tgttgcat    840                                                                  - - ccgactcgtg gtctcgctgt tccttgggag ggtctcctca gagtgattga ct -             #acccgtct    900                                                                  - - cgggggtctt tcatttgggg gctcgtccgg gatctggaga cccctgccca gg -             #gaccaccg    960                                                                  - - acccaccacc gggaggtaag ctggccaaga tcttatatgg ggcacccccg cc -             #ccttgtaa   1020                                                                  - - acttccctga ccctgacatg acaagagtta ctaacagccc ctctctccaa gc -             #tcacttac   1080                                                                  - - aggctctcta cttagtccag cacgaagtct ggagacctct ggcggcagcc ta -             #ccaagaac   1140                                                                  - - aactggaccg accggtggta cctcaccctt accgagtcgg cgacacagtg tg -             #ggtccgcc   1200                           
                                       - - gacaccagac taagaaccta gaacctcgct ggaaaggacc ttacacagtc ct -             #gctgacca   1260                                                                  - - cccccaccgc cctcaaagta gacggcatcg cagcttggat acacgccgcc ca -             #cgtgaagg   1320                                                                  - - ctgccgaccc cgggggtgga ccatcctcta gactgacatg gcgcgttcaa cg -             #ctctcaaa   1380                                                                  - - accccttaaa aataaggtta acccgcgagg ccccctaatc cccttaattc tt -             #ctgatgct   1440                                                                  - - cagaggggtc agtactgctt cgcccggctc cagtcctcat caagtctata at -             #atcacctg   1500                                                                  - - ggaggtaacc aatggagatc gggagacggt atgggcaact tctggcaacc ac -             #cctctgtg   1560                                                                  - - gacctggtgg cctgacctta ccccagattt atgtatgtta gcccaccatg ga -             #ccatctta   1620                                                                  - - ttgggggcta gaatatcaat cccctttttc ttctcccccg gggccccctt gt -             #tgctcagg   1680                                                                  - - gggcagcagc ccaggctgtt ccagagactg cgaagaacct ttaacctccc tc -             #acccctcg   1740                                                                  - - gtgcaacact gcctggaaca gactcaagct agaccagaca actcataaat ca -             #aatgaggg   1800                                                                  - - attttatgtt tgccccgggc cccaccgccc ccgagaatcc aagtcatgtg gg -             #ggtccaga   1860                                                                  - - ctccttctac tgtgcctatt ggggctgtga gacaaccggt agagcttact gg -             #aagccctc   1920                                                                  - - ctcatcatgg gatttcatca cagtaaacaa caatctcacc tctgaccagg ct -  
           #gtccaggt   1980                                                                  - - atgcaaagat aataagtggt gcaacccctt agttattcgg tttacagacg cc -             #gggagacg   2040                                                                  - - ggttacttcc tggaccacag gacattactg gggcttacgt ttgtatgtct cc -             #ggacaaga   2100                                                                  - - tccagggctt acatttggga tccgactcag ataccaaaat ctaggacccc gc -             #gtcccaat   2160                                                                  - - agggccaaac cccgttctgg cagaccaaca gccactctcc aagcccaaac ct -             #gttaagtc   2220                                                                  - - gccttcagtc accaaaccac ccagtgggac tcctctctcc cctacccaac tt -             #ccaccggc   2280                                                                  - - gggaacggaa aataggctgc taaacttagt agacggagcc taccaagccc tc -             #aacctcac   2340                                                                  - - cagtcctgac aaaacccaag agtgctggtt gtgtctagta gcgggacccc cc -             #tactacga   2400                                                                  - - aggggttgcc gtcctgggta cctactccaa ccatacctct gctccagcca ac -             #tgctccgt   2460                                                                  - - ggcctcccaa cacaagttga ccctgtccga agtgaccgga cagggactct gc -             #ataggagc   2520                                                                  - - agttcccaaa acacatcagg ccctatgtaa taccacccag acaagcagtc ga -             #gggtccta   2580                                                                  - - ttatctagtt gcccctacag gtaccatgtg ggcttgtagt accgggctta ct -             #ccatgcat   2640                                                                  - - ctccaccacc atactgaacc ttaccactga ttattgtgtt cttgtcgaac tc -             #tggccaag   2700                                                                  - - 
agtcacctat cattccccca gctatgttta cggcctgttt gagagatcca ac -             #cgacacaa   2760                                                                  - - aagagaaccg gtgtcgttaa ccctggccct attattgggt ggactaacca tg -             #gggggaat   2820                                                                  - - tgccgctgga ataggaacag ggactactgc tctaatggcc actcagcaat tc -             #cagcagct   2880                                                                  - - ccaagccgca gtacaggatg atctcaggga ggttgaaaaa tcaatctcta ac -             #ctagaaaa   2940                                                                  - - gtctctcact tccctgtctg aagttgtcct acagaatcga aggggcctag ac -             #ttgttatt   3000                                                                  - - tctaaaagaa ggagggctgt gtgctgctct aaaagaagaa tgttgcttct at -             #gcggacca   3060                                                                  - - cacaggacta gtgagagaca gcatggccaa attgagagag aggcttaatc ag -             #agacagaa   3120                                                                  - - actgtttgag tcaactcaag gatggtttga gggactgttt aacagatccc ct -             #tggtttac   3180                                                                  - - caccttgata tctaccatta tgggacccct cattgtactc ctaatgattt tg -             #ctcttcgg   3240                                                                  - - accctgcatt cttaatcgat tagttcaatt tgttaaagac aggatctcag ta -             #gtccaggc   3300                                                                  - - tttagtcctg actcaacaat accaccagct aaagcctata gagtacgagc ca -             #tagggcgc   3360                                                                  - - ctagtgttga caattaatca tcggcatagt atacggcata gtataatacg ac -             #tcactata   3420                                                                  - - ggagggccac catggccaag ttgaccagtg ccgttccggt gctcaccgcg cg -             #cgacgtcg   3480                
                                                  - - ccggagcggt cgagttctgg accgaccggc tcgggttctc ccgggacttc gt -             #ggaggacg   3540                                                                  - - acttcgccgg tgtggtccgg gacgacgtga ccctgttcat cagcgcggtc ca -             #ggaccagg   3600                                                                  - - tggtgccgga caacaccctg gcctgggtgt gggtgcgcgg cctggacgag ct -             #gtacgccg   3660                                                                  - - agtggtcgga ggtcgtgtcc acgaacttcc gggacgcctc cgggccggcc at -             #gaccgaga   3720                                                                  - - tcggcgagca gccgtggggg cgggagttcg ccctgcgcga cccggccggc aa -             #ctgcgtgc   3780                                                                  - - acttcgtggc cgaggagcag gactgannnn cggaccggtc gacttgttaa ct -             #tgtttatt   3840                                                                  - - gcagcttata atggttacaa ataaagcaat agcatcacaa atttcacaaa ta -             #aagcattt   3900                                                                  - - ttttcactgc attctagttg tggtttgtcc aaactcatca atgtatctta tc -             #atgtctgg   3960                                                                  - - atccagatct gggcccatgc ggccgcggat cgatnnnnac atgtgagcaa aa -             #ggccagca   4020                                                                  - - aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tc -             #cgcccccc   4080                                                                  - - tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga ca -             #ggactata   4140                                                                  - - aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cg -             #accctgcc   4200                                                                  - - gcttaccgga tacctgtccg cctttctccc ttcgggaagc 
gtggcgcttt ct -             #caatgctc   4260                                                                  - - acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gt -             #gtgcacga   4320                                                                  - - accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg ag -             #tccaaccc   4380                                                                  - - ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gc -             #agagcgag   4440                                                                  - - gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct ac -             #actagaag   4500                                                                  - - gacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa ga -             #gttggtag   4560                                                                  - - ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gc -             #aagcagca   4620                                                                  - - gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cg -             #gggtctga   4680                                                                  - - cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat ca -             #aaaaggat   4740                                                                  - - cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gt -             #atatatga   4800                                                                  - - gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct ca -             #gcgatctg   4860                                                                  - - tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cg -             #atacggga   4920                                                                  - - gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct ca -             #ccggctcc   4980                                                            
      - - agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gt -             #cctgcaac   5040                                                                  - - tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gt -             #agttcgcc   5100                                                                  - - agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt ca -             #cgctcgtc   5160                                                                  - - gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta ca -             #tgatcccc   5220                                                                  - - catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca ga -             #agtaagtt   5280                                                                  - - ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ct -             #gtcatgcc   5340                                                                  - - atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct ga -             #gaatagtg   5400                                                                  - - tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cg -             #ccacatag   5460                                                                  - - cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tc -             #tcaaggat   5520                                                                  - - cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact ga -             #tcttcagc   5580                                                                  - - atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa at -             #gccgcaaa   5640                                                                  - - aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt tt -             #caatatta   5700                                                                  - - ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gt -             #atttagaa   5760      
                                                            - - aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg ac -             #gtctaaga   5820                                                                  - - aaccattatt atcatgacat taacctataa aaataggcgt atcacgaggc cc -             #tttcgtct   5880                                                                  - - cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg ag -             #acggtcac   5940                                                                  - - agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt ca -             #gcgggtgt   6000                                                                  - - tggcgggtgt cggggctggc ttaactatgc ggcatcagag cagattgtac tg -             #agagtgca   6060                                                                  - - c                  - #                  - #                  - #                  6061                                                                   - -  - - <210> SEQ ID NO 7                                                    <211> LENGTH: 6312                                                             <212> TYPE: DNA                                                                <213> ORGANISM: Artificial Sequence                                            <220> FEATURE:                                                                 <223> OTHER INFORMATION: Description of Artificial - #Sequence: Portion       of                                                                                     construct                                                                <220> FEATURE:                                                                 <221> NAME/KEY: misc.sub.-- feature                                            <222> LOCATION: (4058)                                                         <223> OTHER INFORMATION: n is any nucleotide                                   <220> FEATURE:                               
<221> NAME/KEY: misc_feature
<222> LOCATION: (4058)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4059)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4060)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4061)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4246)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4247)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (4248)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
         <222> LOCATION: (4249)                                                         <223> OTHER INFORMATION: n is any nucleotide                                    - - <400> SEQUENCE: 7                                                          - - catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aataccgcat ca -             #ggcgccat     60                                                                  - - tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc tt -             #cgctatta    120                                                                  - - cgccagctgg cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gc -             #cagggttt    180                                                                  - - tcccagtcac gacgttgtaa aacgacggcc agtgaattcc gattagttca at -             #ttgttaaa    240                                                                  - - gacaggatct cagtagtcca ggctttagtc ctgactcaac aataccacca gc -             #taaaacca    300                                                                  - - ctagaatacg agccacaata aataaaagat tttatttagt ttccagaaaa ag -             #gggggaat    360                                                                  - - gaaagacccc accaaattgc ttagcctgat agccgcagta acgccatttt gc -             #aaggcatg    420                                                                  - - gaaaaatacc aaaccaagaa tagagaagtt cagatcaagg gcgggtacac ga -             #aaacagct    480                                                                  - - aacgttgggc caaacaggat atctgcggtg agcagtttcg gccccggccc gg -             #ggccaaga    540                                                                  - - acagatggtc accgcggttc ggccccggcc cggggccaag aacagatggt cc -             #ccagatat    600                                                                  - - ggcccaaccc tcagcagttt cttaagaccc atcagatgtt tccaggctcc cc -             #caaggacc    660                                                                  - - tgaaatgacc 
ctgtgcctta tttgaattaa ccaatcagcc tgcttctcgc tt -             #ctgttcgc    720                                                                  - - gcgcttctgc ttcccgagct ctataaaaga gctcacaacc cctcactcgg cg -             #cgccagtc    780                                                                  - - ctccgataga ctgagtcgcc cgggtacccg tgtatccaat aaatcctctt gc -             #tgttgcat    840                                                                  - - ccgactcgtg gtctcgctgt tccttgggag ggtctcctca gagtgattga ct -             #acccgtct    900                                                                  - - cgggggtctt tcatttgggg gctcgtccgg gatctggaga cccctgccca gg -             #gaccaccg    960                                                                  - - acccaccacc gggaggtaag ctggccaaga tccctaaggt actcgggtca ga -             #caatggcc   1020                                                                  - - cggcctttgt tgctcaggta agtcagggac tggccactca actggggata aa -             #ttggaagt   1080                                                                  - - tacattgtgc gtatagaccc cagagctcag gtcaggtaga aagaatgaac ag -             #aacaatta   1140                                                                  - - aagagacctt gaccaaatta gccttagaga ccggtggaaa agactgggtg ac -             #cctccttc   1200                                                                  - - ccttagcgct gcttagggcc aggaataccc ctggccggtt tggtttaact cc -             #ttatgaaa   1260                                                                  - - ttctctatgg aggaccaccc cccatacttg agtctggaga aactttgggt cc -             #cgatgata   1320                                                                  - - gatttctccc tgtcttattt actcacttaa aggctttaga aattgtaagg ac -             #ccaaatct   1380                                                                  - - gggaccagat caaagaggtg tataagcctg gtaccgtaac aatccctcac cc -             #gttccagg   1440                           
                                       - - tcggggatca agtgcttgtc agacgccatc gacccagcag ccttgagcct cg -             #gtggaaag   1500                                                                  - - gcccatacct ggtgttgctg actaccccga ccgcggtaaa agtcgatggt at -             #tgctgcct   1560                                                                  - - gggtccatgc ttctcacctc aaacctgcac caccttcggc accagatgag tc -             #ctgggagc   1620                                                                  - - tggaaaagac tgatcatcct cttaagctgc gtattcggcg gcggcgggac ga -             #gtctgcaa   1680                                                                  - - aataagaacc cccaccagcc catgaccctc acttggcagg tactgtccca aa -             #ctggagac   1740                                                                  - - gttgtctggg atacaaaggc agtccagccc ccttggactt ggtggcccac ac -             #ttaaacct   1800                                                                  - - gatgtatgtg ccttggcggc tagtcttgag tcctgggata tcccgggaac cg -             #atgtctcg   1860                                                                  - - tcctctaaac gagtcagacc tccggactca gactatactg ccgcttataa gc -             #aaatcacc   1920                                                                  - - tggggagcca tagggtgcag ctaccctcgg gctaggacta gaatggcaag ct -             #ctaccttc   1980                                                                  - - tacgtatgtc cccgggatgg ccggaccctt tcagaagcta gaaggtgcgg gg -             #ggctagaa   2040                                                                  - - tccctatact gtaaagaatg ggattgtgag accacgggga ccggttattg gc -             #tatctaaa   2100                                                                  - - tcctcaaaag acctcataac tgtaaaatgg gaccaaaata gcgaatggac tc -             #aaaaattt   2160                                                                  - - caacagtgtc accagaccgg ctggtgtaac ccccttaaaa tagatttcac ag -  
           #acaaagga   2220                                                                  - - aaattatcca aggactggat aacgggaaaa acctggggat taagattcta tg -             #tgtctgga   2280                                                                  - - catccaggcg tacagttcac cattcgctta aaaatcacca acatgccagc tg -             #tggcagta   2340                                                                  - - ggtcctgacc tcgtccttgt ggaacaagga cctcctagaa cgtccctcgc tc -             #tcccacct   2400                                                                  - - cctcttcccc caagggaagc gccaccgcca tctctccccg actctaactc ca -             #cagccctg   2460                                                                  - - gcgactagtg cacaaactcc cacggtgaga aaaacaattg ttaccctaaa ca -             #ctccgcct   2520                                                                  - - cccaccacag gcgacagact ttttgatctt gtgcaggggg ccttcctaac ct -             #taaatgct   2580                                                                  - - accaacccag gggccactga gtcttgctgg ctttgtttgg ccatgggccc cc -             #cttattat   2640                                                                  - - gaagcaatag cctcatcagg agaggtcgcc tactccaccg accttgaccg gt -             #gccgctgg   2700                                                                  - - gggacccaag gaaagctcac cctcactgag gtctcaggac acgggttgtg ca -             #taggaaag   2760                                                                  - - gtgcccttta cccatcagca tctctgcaat cagaccctat ccatcaattc ct -             #ccggagac   2820                                                                  - - catcagtatc tgctcccctc caaccatagc tggtgggctt gcagcactgg cc -             #tcacccct   2880                                                                  - - tgcctctcca cctcagtttt taatcagact agagatttct gtatccaggt cc -             #agctgatt   2940                                                                  - - 
cctcgcatct attactatcc tgaagaagtt ttgttacagg cctatgacaa tt -             #ctcacccc   3000                                                                  - - aggactaaaa gagaggctgt ctcacttacc ctagctgttt tactggggtt gg -             #gaatcacg   3060                                                                  - - gcgggaatag gtactggttc aactgcctta attaaaggac ctatagacct cc -             #agcaaggc   3120                                                                  - - ctgacaagcc tccagatcgc catagatgct gacctccggg ccctccaaga ct -             #cagtcagc   3180                                                                  - - aagttagagg actcactgac ttccctgtcc gaggtagtgc tccaaaatag ga -             #gaggcctt   3240                                                                  - - gacttgctgt ttctaaaaga aggtggcctc tgtgcggccc taaaggaaga gt -             #gctgtttt   3300                                                                  - - tacatagacc actcaggtgc agtacgggac tccatgaaaa aactcaaaga aa -             #aactggat   3360                                                                  - - aaaagacagt tagagcgcca gaaaagccaa aactggtatg aaggatggtt ca -             #ataactcc   3420                                                                  - - ccttggttca ctaccctgct atcaaccatc gctgggcccc tattactcct cc -             #ttctgttg   3480                                                                  - - ctcatcctcg ggccatgcat catcaatcga ttagttcaat ttgttaaaga ca -             #ggatctca   3540                                                                  - - gtagtccagg ctttagtcct gactcaacaa taccaccagc taaagcctat ag -             #agtacgag   3600                                                                  - - ccatagggcg cctagtgttg acaattaatc atcggcatag tatacggcat ag -             #tataatac   3660                                                                  - - gactcactat aggagggcca ccatggccaa gttgaccagt gccgttccgg tg -             #ctcaccgc   3720                
                                                  - - gcgcgacgtc gccggagcgg tcgagttctg gaccgaccgg ctcgggttct cc -             #cgggactt   3780                                                                  - - cgtggaggac gacttcgccg gtgtggtccg ggacgacgtg accctgttca tc -             #agcgcggt   3840                                                                  - - ccaggaccag gtggtgccgg acaacaccct ggcctgggtg tgggtgcgcg gc -             #ctggacga   3900                                                                  - - gctgtacgcc gagtggtcgg aggtcgtgtc cacgaacttc cgggacgcct cc -             #gggccggc   3960                                                                  - - catgaccgag atcggcgagc agccgtgggg gcgggagttc gccctgcgcg ac -             #ccggccgg   4020                                                                  - - caactgcgtg cacttcgtgg ccgaggagca ggactgannn ncggaccggt cg -             #acttgtta   4080                                                                  - - acttgtttat tgcagcttat aatggttaca aataaagcaa tagcatcaca aa -             #tttcacaa   4140                                                                  - - ataaagcatt tttttcactg cattctagtt gtggtttgtc caaactcatc aa -             #tgtatctt   4200                                                                  - - atcatgtctg gatccagatc tgggcccatg cggccgcgga tcgatnnnna ca -             #tgtgagca   4260                                                                  - - aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tt -             #tccatagg   4320                                                                  - - ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gc -             #gaaacccg   4380                                                                  - - acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ct -             #ctcctgtt   4440                                                                  - - ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc 
cttcgggaag cg -             #tggcgctt   4500                                                                  - - tctcaatgct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc ca -             #agctgggc   4560                                                                  - - tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ct -             #atcgtctt   4620                                                                  - - gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg ta -             #acaggatt   4680                                                                  - - agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc ta -             #actacggc   4740                                                                  - - tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac ct -             #tcggaaaa   4800                                                                  - - agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tt -             #tttttgtt   4860                                                                  - - tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt ga -             #tcttttct   4920                                                                  - - acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt ca -             #tgagatta   4980                                                                  - - tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa at -             #caatctaa   5040                                                                  - - agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga gg -             #cacctatc   5100                                                                  - - tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gt -             #agataact   5160                                                                  - - acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg ag -             #acccacgc   5220                                                            
      - - tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gc -             #gcagaagt   5280                                                                  - - ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga ag -             #ctagagta   5340                                                                  - - agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg ca -             #tcgtggtg   5400                                                                  - - tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aa -             #ggcgagtt   5460                                                                  - - acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc ga -             #tcgttgtc   5520                                                                  - - agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca ta -             #attctctt   5580                                                                  - - actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac ca -             #agtcattc   5640                                                                  - - tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg gg -             #ataatacc   5700                                                                  - - gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc gg -             #ggcgaaaa   5760                                                                  - - ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tg -             #cacccaac   5820                                                                  - - tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac ag -             #gaaggcaa   5880                                                                  - - aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat ac -             #tcttcctt   5940                                                                  - - tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata ca -             #tatttgaa   6000      
                                                            - - tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa ag -             #tgccacct   6060                                                                  - - gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg ta -             #tcacgagg   6120                                                                  - - ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gc -             #agctcccg   6180                                                                  - - gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg tc -             #agggcgcg   6240                                                                  - - tcagcgggtg ttggcgggtg tcggggctgg cttaactatg cggcatcaga gc -             #agattgta   6300                                                                  - - ctgagagtgc ac              - #                  - #                       - #     6312                                                                   - -  - - <210> SEQ ID NO 8                                                    <211> LENGTH: 5865                                                             <212> TYPE: DNA                                                                <213> ORGANISM: Artificial Sequence                                            <220> FEATURE:                                                                 <223> OTHER INFORMATION: Description of Artificial - #Sequence: Portion       of                                                                                     construct                                                                <220> FEATURE:                                                                 <221> NAME/KEY: misc.sub.-- feature                                            <222> LOCATION: (3611)                                                         <223> OTHER INFORMATION: n is any nucleotide                                   <220> FEATURE:                               
<221> NAME/KEY: misc_feature
<222> LOCATION: (3612)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3613)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3614)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3799)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3800)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3801)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
         <222> LOCATION: (3802)                                                         <223> OTHER INFORMATION: n is any nucleotide                                    - - <400> SEQUENCE: 8                                                          - - catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aataccgcat ca -             #ggcgccat     60                                                                  - - tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc tt -             #cgctatta    120                                                                  - - cgccagctgg cgaaaggggg atgtgctgca aggcgattaa gttgggtaac gc -             #cagggttt    180                                                                  - - tcccagtcac gacgttgtaa aacgacggcc agtgaattcc gattagttca at -             #ttgttaaa    240                                                                  - - gacaggatct cagtagtcca ggctttagtc ctgactcaac aataccacca gc -             #taaaacca    300                                                                  - - ctagaatacg agccacaata aataaaagat tttatttagt ttccagaaaa ag -             #gggggaat    360                                                                  - - gaaagacccc accaaattgc ttagcctgat agccgcagta acgccatttt gc -             #aaggcatg    420                                                                  - - gaaaaatacc aaaccaagaa tagagaagtt cagatcaagg gcgggtacac ga -             #aaacagct    480                                                                  - - aacgttgggc caaacaggat atctgcggtg agcagtttcg gccccggccc gg -             #ggccaaga    540                                                                  - - acagatggtc accgcggttc ggccccggcc cggggccaag aacagatggt cc -             #ccagatat    600                                                                  - - ggcccaaccc tcagcagttt cttaagaccc atcagatgtt tccaggctcc cc -             #caaggacc    660                                                                  - - tgaaatgacc 
ctgtgcctta tttgaattaa ccaatcagcc tgcttctcgc tt -             #ctgttcgc    720                                                                  - - gcgcttctgc ttcccgagct ctataaaaga gctcacaacc cctcactcgg cg -             #cgccagtc    780                                                                  - - ctccgataga ctgagtcgcc cgggtacccg tgtatccaat aaatcctctt gc -             #tgttgcat    840                                                                  - - ccgactcgtg gtctcgctgt tccttgggag ggtctcctca gagtgattga ct -             #acccgtct    900                                                                  - - cgggggtctt tcatttgggg gctcgtccgg gatctggaga cccctgccca gg -             #gaccaccg    960                                                                  - - acccaccacc gggaggtaag ctggccaaga tcccccgggc tgcaggaatt ta -             #tgaaatcc   1020                                                                  - - tttatggggg acccccccct ttgtcaacct tgctcaattc cttctccccc tc -             #cgatccta   1080                                                                  - - agactgattt acaagcccga ctaaaagggc tgcaaggcgt gcaggcccaa at -             #ctggacac   1140                                                                  - - ccctggccga attgtaccgg ccaggacatc cacaaactag ccacccattt ca -             #ggtgggag   1200                                                                  - - actccgtgta cgtccggcgg caccgctctc aaggattgga gcctcgttgg aa -             #gggacctt   1260                                                                  - - acatcgtcct gctgaccacg cccaccgcca taaaggttga cgggatcgcc gc -             #ctggattc   1320                                                                  - - acgcatcgca cgccaaggca gccccaaaaa cccctggacc agaaactccc aa -             #aacctgga   1380                                                                  - - agctccgccg ttcggagaac cctcttaaga taagactctc ccgtgtctga ct -             #gctaatcc   1440                           
                                       - - accttgtccc tgtactaacc caaaatgaaa ctcccaacag gaatggtcat tt -             #tatgtagc   1500                                                                  - - ctaataatag ttcgggcagg gtttgacgac ccccgcaagg ctatcgcatt ag -             #tacaaaaa   1560                                                                  - - caacatggta aaccatgcga atgcagcgga gggcaggtat ccgaggcccc ac -             #cgaactcc   1620                                                                  - - atccaacagg taacttgccc aggcaagacg gcctacttaa tgaccaacca aa -             #aatggaaa   1680                                                                  - - tgcagagtca ctccaaaaat ctcacctagc gggggagaac tccagaactg cc -             #cctgtaac   1740                                                                  - - actttccagg actcgatgca cagttcttgt tatactgaat accggcaatg ca -             #ggcgaatt   1800                                                                  - - aataagacat actacacggc caccttgctt aaaatacggt ctgggagcct ca -             #acgaggta   1860                                                                  - - cagatattac aaaaccccaa tcagctccta cagtcccctt gtaggggctc ta -             #taaatcag   1920                                                                  - - cccgtttgct ggagtgccac agcccccatc catatctccg atggtggagg ac -             #ccctcgat   1980                                                                  - - actaagagag tgtggacagt ccaaaaaagg ctagaacaaa ttcataaggc ta -             #tgactcct   2040                                                                  - - gaacttcaat accacccctt agccctgccc aaagtcagag atgaccttag cc -             #ttgatgca   2100                                                                  - - cggacttttg atatcctgaa taccactttt aggttactcc agatgtccaa tt -             #ttagcctt   2160                                                                  - - gcccaagatt gttggctctg tttaaaacta ggtaccccta cccctcttgc ga -  
           #tacccact   2220                                                                  - - ccctctttaa cctactccct agcagactcc ctagcgaatg cctcctgtca ga -             #ttatacct   2280                                                                  - - cccctcttgg ttcaaccgat gcagttctcc aactcgtcct gtttatcttc cc -             #ctttcatt   2340                                                                  - - aacgatacgg aacaaataga cttaggtgca gtcaccttta ctaactgcac ct -             #ctgtagcc   2400                                                                  - - aatgtcagta gtcctttatg tgccctaaac gggtcagtct tcctctgtgg aa -             #ataacatg   2460                                                                  - - gcatacacct atttacccca aaactggacc agactttgcg tccaagcctc cc -             #tcctcccc   2520                                                                  - - gacattgaca tcaacccggg ggatgagcca gtccccattc ctgccattga tc -             #attatata   2580                                                                  - - catagaccta aacgagctgt acagttcatc cctttactag ctggactggg aa -             #tcaccgca   2640                                                                  - - gcattcacca ccggagctac aggcctaggt gtctccgtca cccagtatac aa -             #aattatcc   2700                                                                  - - catcagttaa tatctgatgt ccaagtctta tccggtacca tacaagattt ac -             #aagaccag   2760                                                                  - - gtagactcgt tagctgaagt agttctccaa aataggaggg gactggacct ac -             #taacggca   2820                                                                  - - gaacaaggag gaatttgttt agccttacaa gaaaaatgct gtttttatgc ta -             #acaagtca   2880                                                                  - - ggaattgtga gaaacaaaat aagaacccta caagaagaat tacaaaaacg ca -             #gggaaagc   2940                                                                  - - 
ctggcaacca accctctctg gaccgggctg cagggctttc ttccgtacct cc -             #tacctctc   3000                                                                  - - ctgggacccc tactcaccct cctactcata ctaaccattg ggccatgcgt tt -             #tcagtcgc   3060                                                                  - - ctcatggcct tcattaatga tagacttaat gttgtacatg ccatggtgct gg -             #cccagcaa   3120                                                                  - - taccaagcac tcaaagctga ggaagaagct caggattgag gcgcctagtg tt -             #gacaatta   3180                                                                  - - atcatcggca tagtatacgg catagtataa tacgactcac tataggaggg cc -             #accatggc   3240                                                                  - - caagttgacc agtgccgttc cggtgctcac cgcgcgcgac gtcgccggag cg -             #gtcgagtt   3300                                                                  - - ctggaccgac cggctcgggt tctcccggga cttcgtggag gacgacttcg cc -             #ggtgtggt   3360                                                                  - - ccgggacgac gtgaccctgt tcatcagcgc ggtccaggac caggtggtgc cg -             #gacaacac   3420                                                                  - - cctggcctgg gtgtgggtgc gcggcctgga cgagctgtac gccgagtggt cg -             #gaggtcgt   3480                                                                  - - gtccacgaac ttccgggacg cctccgggcc ggccatgacc gagatcggcg ag -             #cagccgtg   3540                                                                  - - ggggcgggag ttcgccctgc gcgacccggc cggcaactgc gtgcacttcg tg -             #gccgagga   3600                                                                  - - gcaggactga nnnncggacc ggtcgacttg ttaacttgtt tattgcagct ta -             #taatggtt   3660                                                                  - - acaaataaag caatagcatc acaaatttca caaataaagc atttttttca ct -             #gcattcta   3720                
                                                  - - gttgtggttt gtccaaactc atcaatgtat cttatcatgt ctggatccag at -             #ctgggccc   3780                                                                  - - atgcggccgc ggatcgatnn nnacatgtga gcaaaaggcc agcaaaaggc ca -             #ggaaccgt   3840                                                                  - - aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gc -             #atcacaaa   3900                                                                  - - aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata cc -             #aggcgttt   3960                                                                  - - ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cg -             #gatacctg   4020                                                                  - - tccgcctttc tcccttcggg aagcgtggcg ctttctcaat gctcacgctg ta -             #ggtatctc   4080                                                                  - - agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cg -             #ttcagccc   4140                                                                  - - gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag ac -             #acgactta   4200                                                                  - - tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt ag -             #gcggtgct   4260                                                                  - - acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt at -             #ttggtatc   4320                                                                  - - tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg at -             #ccggcaaa   4380                                                                  - - caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gc -             #gcagaaaa   4440                                                                  - - aaaggatctc aagaagatcc tttgatcttt tctacggggt 
ctgacgctca gt -             #ggaacgaa   4500                                                                  - - aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ct -             #agatcctt   4560                                                                  - - ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac tt -             #ggtctgac   4620                                                                  - - agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tc -             #gttcatcc   4680                                                                  - - atagttgcct gactccccgt cgtgtagata actacgatac gggagggctt ac -             #catctggc   4740                                                                  - - cccagtgctg caatgatacc gcgagaccca cgctcaccgg ctccagattt at -             #cagcaata   4800                                                                  - - aaccagccag ccggaagggc cgagcgcaga agtggtcctg caactttatc cg -             #cctccatc   4860                                                                  - - cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttaa ta -             #gtttgcgc   4920                                                                  - - aacgttgttg ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg ta -             #tggcttca   4980                                                                  - - ttcagctccg gttcccaacg atcaaggcga gttacatgat cccccatgtt gt -             #gcaaaaaa   5040                                                                  - - gcggttagct ccttcggtcc tccgatcgtt gtcagaagta agttggccgc ag -             #tgttatca   5100                                                                  - - ctcatggtta tggcagcact gcataattct cttactgtca tgccatccgt aa -             #gatgcttt   5160                                                                  - - tctgtgactg gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gc -             #gaccgagt   5220                                                            
      - - tgctcttgcc cggcgtcaat acgggataat accgcgccac atagcagaac tt -             #taaaagtg   5280                                                                  - - ctcatcattg gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gc -             #tgttgaga   5340                                                                  - - tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt ta -             #ctttcacc   5400                                                                  - - agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aa -             #taagggcg   5460                                                                  - - acacggaaat gttgaatact catactcttc ctttttcaat attattgaag ca -             #tttatcag   5520                                                                  - - ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa ac -             #aaataggg   5580                                                                  - - gttccgcgca catttccccg aaaagtgcca cctgacgtct aagaaaccat ta -             #ttatcatg   5640                                                                  - - acattaacct ataaaaatag gcgtatcacg aggccctttc gtctcgcgcg tt -             #tcggtgat   5700                                                                  - - gacggtgaaa acctctgaca catgcagctc ccggagacgg tcacagcttg tc -             #tgtaagcg   5760                                                                  - - gatgccggga gcagacaagc ccgtcagggc gcgtcagcgg gtgttggcgg gt -             #gtcggggc   5820                                                                  - - tggcttaact atgcggcatc agagcagatt gtactgagag tgcac   - #                     5865                                                                         - -  - - <210> SEQ ID NO 9                                                    <211> LENGTH: 3925                                                             <212> TYPE: DNA                                                                <213> ORGANISM: 
Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3910)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3911)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3912)
<223> OTHER INFORMATION: n is any nucleotide
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (3913)
<223> OTHER INFORMATION: n is any nucleotide
 - - <400> SEQUENCE: 9
 - - agatctcccg atcccctatg gtcgactctc agtacaatct gctctgatgc cg - #catagtta 60
 - - agccagtatc tgctccctgc ttgtgtgttg gaggtcgctg agtagtgcgc ga - #gcaaaatt 120
                                        - - taagctacaa caaggcaagg cttgaccgac aattgcatga agaatctgct ta -             #gggttagg    180                                                                  - - cgttttgcgc tgcttcgcga tgtacgggcc agatatacgc gttgacattg at -             #tattgact    240                                                                  - - agttattaat agtaatcaat tacggggtca ttagttcata gcccatatat gg -             #agttccgc    300                                                                  - - gttacataac ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc cc -             #gcccattg    360                                                                  - - acgtcaataa tgacgtatgt tcccatagta acgccaatag ggactttcca tt -             #gacgtcaa    420                                                                  - - tgggtggact atttacggta aactgcccac ttggcagtac atcaagtgta tc -             #atatgcca    480                                                                  - - agtacgcccc ctattgacgt caatgacggt aaatggcccg cctggcatta tg -             #cccagtac    540                                                                  - - atgaccttat gggactttcc tacttggcag tacatctacg tattagtcat cg -             #ctattacc    600                                                                  - - atggtgatgc ggttttggca gtacatcaat gggcgtggat agcggtttga ct -             #cacgggga    660                                                                  - - tttccaagtc tccaccccat tgacgtcaat gggagtttgt tttggcacca aa -             #atcaacgg    720                                                                  - - gactttccaa aatgtcgtaa caactccgcc ccattgacgc aaatgggcgg ta -             #ggcgtgta    780                                                                  - - cggtgggagg tctatataag cagagctctc tggctaacta gagaacccac tg -             #cttaactg    840                                                                  - - gcttatcgaa atgtcgactg agaacttcag ggtgagtttg gggacccttg at - 
            #tgttcttt    900                                                                  - - ctttttcgct attgtaaaat tcatgttata tggagggggc aaagttttca gg -             #gtgttgtt    960                                                                  - - tagaatggga agatgtccct tgtatcacca tggaccctca tgataatttt gt -             #ttctttca   1020                                                                  - - ctttctactc tgttgacaac cattgtctcc tcttattttc ttttcatttt ct -             #gtaacttt   1080                                                                  - - ttcgttaaac tttagcttgc atttgtaacg aatttttaaa ttcacttttg tt -             #tatttgtc   1140                                                                  - - agattgtaag tactttctct aatcactttt ttttcaaggc aatcagggta ta -             #ttatattg   1200                                                                  - - tacttcagca cagttttaga gaacaattgt tataattaaa tgataaggta ga -             #atatttct   1260                                                                  - - gcatataaat tctggctggc gtggaaatat tcttattggt agaaacaact ac -             #atcctggt   1320                                                                  - - catcatcctg cctttctctt tatggttaca atgatataca ctgtttgaga tg -             #aggataaa   1380                                                                  - - atactctgag tccaaaccgg gcccctctgc taaccatgtt catgccttct tc -             #tttttcct   1440                                                                  - - acagctcctg ggcaacgtgc tggttgttgt gctgtctcat cattttggca ag -             #gatcggcc   1500                                                                  - - ggaacagcat caggaccgac atggaaggtc cagcgttctc aaaacccctt aa -             #agataaga   1560                                                                  - - ttaacccgtg gaagtcctta atggtcatgg gggtctattt aagagtaggg at -             #ggcagaga   1620                                                                  - - 
gcccccatca ggtctttaat gtaacctgga gagtcaccaa cctgatgact gg -             #gcgtaccg   1680                                                                  - - ccaatgccac ctccctttta ggaactgtac aagatgcctt cccaagatta ta -             #ttttgatc   1740                                                                  - - tatgtgatct ggtcggagaa gagtgggacc cttcagacca ggaaccatat gt -             #cgggtatg   1800                                                                  - - gctgcaaata ccccggaggg agaaagcgga cccggacttt tgacttttac gt -             #gtgccctg   1860                                                                  - - ggcataccgt aaaatcgggg tgtggggggc caagagaggg ctactgtggt ga -             #atggggtt   1920                                                                  - - gtgaaaccac cggacaggct tactggaagc ccacatcatc atgggaccta at -             #ctccctta   1980                                                                  - - agcgcggtaa caccccctgg gacacgggat gctccaaaat ggcttgtggc cc -             #ctgctacg   2040                                                                  - - acctctccaa agtatccaat tccttccaag gggctactcg agggggcaga tg -             #caaccctc   2100                                                                  - - tagtcctaga attcactgat gcaggaaaaa aggctaattg ggacgggccc aa -             #atcgtggg   2160                                                                  - - gactgagact gtaccggaca ggaacagatc ctattaccat gttctccctg ac -             #ccgccagg   2220                                                                  - - tcctcaatat agggccccgc atccccattg ggcctaatcc cgtgatcact gg -             #tcaactac   2280                                                                  - - ccccctcccg acccgtgcag atcaggctcc ccaggcctcc tcagcctcct cc -             #tacaggcg   2340                                                                  - - cagcctctat agtccctgag actgccccac cttctcaaca acctgggacg gg -             #agacaggc   2400                
                                                  - - tgctaaacct ggtagaagga gcctatcagg cgcttaacct caccaatccc ga -             #caagaccc   2460                                                                  - - aagaatgttg gctgtgctta gtgtcgggac ctccttatta cgaaggagta gc -             #ggtcgtgg   2520                                                                  - - gcacttatac caatcattct accgccccgg ccagctgtac ggccacttcc ca -             #acataagc   2580                                                                  - - ttaccctatc tgaagtgaca ggacagggcc tatgcatggg agcactacct aa -             #aactcacc   2640                                                                  - - aggccttatg taacaccacc caaagtgccg gctcaggatc ctactacctt gc -             #agcacccg   2700                                                                  - - ctggaacaat gtgggcttgt agcactggat tgactccctg cttgtccacc ac -             #gatgctca   2760                                                                  - - atctaaccac agactattgt gtattagttg agctctggcc cagaataatt ta -             #ccactccc   2820                                                                  - - ccgattatat gtatggtcag cttgaacagc gtaccaaata taagagggag cc -             #agtatcgt   2880                                                                  - - tgaccctggc ccttctgcta ggaggattaa ccatgggagg gattgcagct gg -             #aataggga   2940                                                                  - - cggggaccac tgccctaatc aaaacccagc agtttgagca gcttcacgcc gc -             #tatccaga   3000                                                                  - - cagacctcaa cgaagtcgaa aaatcaatta ccaacctaga aaagtcactg ac -             #ctcgttgt   3060                                                                  - - ctgaagtagt cctacagaac cgaagaggcc tagatttgct cttcctaaaa ga -             #gggaggtc   3120                                                                  - - tctgcgcagc cctaaaagaa gaatgttgtt tttatgcaga 
ccacacggga ct -             #agtgagag   3180                                                                  - - acagcatggc caaactaagg gaaaggctta atcagagaca aaaactattt ga -             #gtcaggcc   3240                                                                  - - aaggttggtt cgaagggcag tttaatagat ccccctggtt taccacctta at -             #ctccacca   3300                                                                  - - tcatgggacc tctaatagta ctcttactga tcttactctt tggaccctgc at -             #tctcaatc   3360                                                                  - - gattagttca atttgttaaa gacaggatct cagtagtcca ggctttagtc ct -             #gactcaac   3420                                                                  - - aataccacca gctaaagcct atagagtacg agccataggg cgcctagtgt tg -             #acaattaa   3480                                                                  - - tcatcggcat agtatacggc atagtataat acgactcact ataggagggc ca -             #ccatggcc   3540                                                                  - - aagttgacca gtgccgttcc ggtgctcacc gcgcgcgacg tcgccggagc gg -             #tcgagttc   3600                                                                  - - tggaccgacc ggctcgggtt ctcccgggac ttcgtggagg acgacttcgc cg -             #gtgtggtc   3660                                                                  - - cgggacgacg tgaccctgtt catcagcgcg gtccaggacc aggtggtgcc gg -             #acaacacc   3720                                                                  - - ctggcctggg tgtgggtgcg cggcctggac gagctgtacg ccgagtggtc gg -             #aggtcgtg   3780                                                                  - - tccacgaact tccgggacgc ctccgggccg gccatgaccg agatcggcga gc -             #agccgtgg   3840                                                                  - - gggcgggagt tcgccctgcg cgacccggcc ggcaactgcg tgcacttcgt gg -             #ccgaggag   3900                                                            
caggactgan nnncggaccg gtcga                                          3925

<210> SEQ ID NO 10
<211> LENGTH: 58
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Oligonucleotide
<400> SEQUENCE: 10
cggaattcgg atccgagctc ggcccagccg gccaccatga aaacatttaa catttctc        58

<210> SEQ ID NO 11
<211> LENGTH: 32
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Oligonucleotide
<400> SEQUENCE: 11
gatccatcga taagcttggt ggtaaaactt tt                                    32

<210> SEQ ID NO 12
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Oligonucleotide
<400> SEQUENCE: 12
gctcttcgga ccctgcattc                                                  20

<210> SEQ ID NO 13
<211> LENGTH: 34
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Oligonucleotide
<400> SEQUENCE: 13
tagcatggcg ccctatggct cgtactctat aggc                                  34

<210> SEQ ID NO 14
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 14
cgcctcatgg ccttcattaa                                                  20

<210> SEQ ID NO 15
<211> LENGTH: 31
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 15
tagcatggcg cctcaatcct gagcttcttc c                                     31

<210> SEQ ID NO 16
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 16
tctcgcttct gttcgcgcgc                                                  20

<210> SEQ ID NO 17
<211> LENGTH: 39
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 17
tcgatcaagc ttgcggccgc ggtggtgggt cggtggtcc                             39

<210> SEQ ID NO 18
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 18
ctctggctca cagtacgacg tag                                              23

<210> SEQ ID NO 19
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 19
ccatcaatcc ggtaggtttt ccg                                              23

<210> SEQ ID NO 20
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 20
carrgkttca araacwsycc cac                                              23

<210> SEQ ID NO 21
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<220> FEATURE:
<221> NAME/KEY: misc_feature
<222> LOCATION: (12)
<223> OTHER INFORMATION: n is any nucleotide
<400> SEQUENCE: 21
agyarvgtag cngggtthag g                                                21

<210> SEQ ID NO 22
<211> LENGTH: 26
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 22
tccccttgga atactcctgt tttygt                                           26

<210> SEQ ID NO 23
<211> LENGTH: 27
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 23
cattccttgt ggtaaaactt tccaytg                                          27

<210> SEQ ID NO 24
<211> LENGTH: 20
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 24
cctcaccctg atcacryttg                                                  20

<210> SEQ ID NO 25
<211> LENGTH: 21
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 25
gaattatgtc tgacagaagg g                                                21

<210> SEQ ID NO 26
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 26
gttgacatct gcagagaaag acc                                              23

<210> SEQ ID NO 27
<211> LENGTH: 23
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Primer
<400> SEQUENCE: 27
tctgaggtct gtacacacaa tgg                                              23

<210> SEQ ID NO 28
<211> LENGTH: 167
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<400> SEQUENCE: 28
tctagactga catggcgcgt tcaacgctct caaaacccct taaaaataag gttaacccgc      60
gaggccccct aatcccctta attcttctga tgctcagagg ggtcagtact gcttcgcccg     120
gctccagtgc ggcccagccg gccaccatga aaacatttaa catttct                   167

<210> SEQ ID NO 29
<211> LENGTH: 103
<212> TYPE: DNA
<213> ORGANISM: Artificial Sequence
<220> FEATURE:
<223> OTHER INFORMATION: Description of Artificial Sequence: Portion of construct
<400> SEQUENCE: 29
tacgagccat agggcgccta gtgttgacaa ttaatcatcg gcatagtata cggcatagta      60
taatacgact cactatagga gggccaccat ggccaagttg acc                       103
__________________________________________________________________________
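By way of illustration only, the spacing between the stop codon of a gene of interest and the start codon of a downstream selectable marker can be measured directly from a junction sequence such as SEQ ID NO 29. The following Python sketch assumes that the first in-frame stop codon read from position 1 of SEQ ID NO 29 terminates the gene of interest, and that the next ATG downstream is the start codon of the selectable marker. These assumptions, and the function names, are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only. The sequence is SEQ ID NO 29 from the listing;
# the helper names and the choice of reading frame are assumptions.
SEQ_ID_29 = (
    "tacgagccat agggcgccta gtgttgacaa ttaatcatcg gcatagtata cggcatagta"
    "taatacgact cactatagga gggccaccat ggccaagttg acc"
).replace(" ", "")

STOP_CODONS = {"tag", "taa", "tga"}


def first_in_frame_stop(seq: str, frame: int = 0) -> int:
    """Return the 0-based index of the first stop codon in the given frame."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    raise ValueError("no in-frame stop codon found")


def spacer_length(seq: str, frame: int = 0) -> int:
    """Count the nucleotides between the upstream stop codon and the next ATG.

    Because the search takes the first ATG after the stop codon, the spacer
    it measures contains no other start codon by construction.
    """
    stop = first_in_frame_stop(seq, frame)
    start = seq.find("atg", stop + 3)
    if start < 0:
        raise ValueError("no start codon found downstream of the stop codon")
    return start - (stop + 3)


def in_range(length: int, low: int, high: int) -> bool:
    """Check a spacer length against an inclusive nucleotide range."""
    return low <= length <= high


if __name__ == "__main__":
    n = spacer_length(SEQ_ID_29)
    print(n)                     # 76 under the stated assumptions
    print(in_range(n, 20, 200))  # True
    print(in_range(n, 60, 80))   # True
```

Under these assumptions the spacer in SEQ ID NO 29 is 76 nucleotides long, which lies within both the 20 to 200 nucleotide range and the 60 to 80 nucleotide range recited in the claims. A different choice of reading frame or start codon would give a different figure.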

We claim:
 1. A recombinant expression vector comprising a gene of interest and a selectable marker gene, wherein the selectable marker gene is arranged downstream of the gene of interest and a stop codon associated with the gene of interest is spaced from a start codon of said selectable marker gene at a distance which is sufficient to ensure that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation.
 2. A recombinant expression vector according to claim 1 wherein the vector is a viral vector.
 3. A recombinant expression vector according to claim 1 wherein the gene of interest is included as part of a viral packaging construct.
 4. A recombinant expression vector according to claim 1, wherein the number of nucleotides in the space between the stop codon of the gene of interest and the start codon of the selectable marker is in the range of from 20 to 200 nucleotides.
 5. A host cell transformed with a recombinant expression vector according to claim 1.
 6. A retroviral packaging cell line according to claim 1, wherein a packaging-deficient construct comprising a viral env gene and second selectable marker is the FBdelPASAF (SEQ ID No. 5), the FBdelPMOSAF (SEQ ID No. 6), FbdelPGASAF (SEQ ID No. 7), the FbdelPRDSAF (SEQ ID No. 8), the FbdelPXSAF (FIG. 3), the FbdelP10A1SAF (FIG. 3), or the FbdelPVSVGSAF (FIG. 3) expression construct.
 7. A recombinant expression vector according to claim 2 wherein the vector is a retroviral vector.
 8. A recombinant expression vector according to claim 7, wherein the vector is a human complement-resistant retroviral vector.
 9. A recombinant expression vector according to claim 4, wherein the number of nucleotides in the space between the stop codon of the gene of interest and the start codon of the selectable marker is in the range of from 60 to 80 nucleotides.
 10. A nucleic acid construct comprising a gene of interest and a selectable marker gene, the selectable marker gene being operably linked 3' to the gene of interest, said gene of interest associated with a stop codon spaced from a start codon of said selectable marker gene at a distance sufficient to ensure that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation.
 11. A vector comprising the nucleic acid construct of claim 10.
 12. A process for producing a cell line in which a gene of interest is expressed, which process comprises: transforming host cells with a nucleic acid construct according to claim 10; selecting those cells where expression of the selectable marker gene may be detected, and growing said transformed cells in the presence of a selection agent, thereby producing a cell line expressing said gene of interest.
 13. A process according to claim 12, wherein the host cell is a eukaryotic cell.
 14. A vector as claimed in claim 11, said vector being selected from the group consisting of plasmids, recombinant retroviral vectors and viral vectors.
 15. A retroviral packaging cell line comprising a host cell transformed with a first and a second recombinant expression vector, said first recombinant expression vector having a packaging-deficient construct comprising a viral gag-pol gene and a first selectable marker gene downstream thereof, and said second recombinant expression vector having a packaging-deficient construct comprising a viral env gene and a second selectable marker gene downstream thereof; wherein the start codon of the first and second selectable markers are spaced from the stop codons of the viral gag-pol gene and the viral env gene respectively by a distance which ensures that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation.
 16. A retroviral packaging cell line according to claim 15, wherein said retroviral packaging cell line is human complement-resistant.
 17. A retroviral packaging cell line according to claim 15, wherein the first selectable marker is a bsr selectable marker and the second selectable marker is a phleo selectable marker.
 18. A retroviral packaging cell line according to claim 15, wherein the packaging-deficient construct comprising the viral gag-pol gene and first selectable marker is the CeB (SEQ ID No. 2) expression construct.
 19. A retroviral packaging cell line according to claim 15, wherein recombinant expression vector is a packaging-deficient retroviral helper construct.
 20. A retroviral packaging cell line according to claim 15, wherein the viral gag-pol gene and the selectable marker are expressed under the control of a non-retroviral promoter.
 21. A retroviral packaging cell line according to claim 15, wherein the viral env gene and the selectable marker are under the control of a non-retroviral promoter.
 22. A retroviral packaging cell line according to claim 15, wherein the cell line is the HT1080 line, the TE671 line, the 3T3 line, the 293 line or the MV-1-1U line.
 23. A retroviral packaging cell line according to claim 15, wherein the retroviral packaging cell is a human HT1080 cell and expresses RD114 envelopes.
 24. A retroviral packaging cell line according to claim 15, wherein said second recombinant expression vector is a packaging-deficient retroviral helper construct.
 25. A retroviral packaging cell line according to claim 17, wherein the retroviral packaging cells comprise human TE671 cells and express RD114 envelopes.
 26. A retroviral packaging cell line according to claim 19, wherein overlapping sequences between genomes of a retroviral vector sequence and a packaging-deficient construct are reduced by minimizing the extent of non-coding retroviral sequences in a packaging deficient genome.
 27. A retroviral packaging cell line according to claim 20, wherein the promoter is fused to rabbit beta-1 globin intron.
 28. A retroviral packaging cell line according to claim 20, wherein the promoter is a hCMV promoter.
 29. A retroviral packaging cell line according to claim 20, wherein the viral gag-pol gene and the selectable marker is a hCMV+intron (SEQ ID No. 3) or a hCMV+intronkaSD (SEQ ID No. 4) expression construct.
 30. A retroviral packaging cell line according to claim 21, wherein the promoter is fused to rabbit beta-1 globin intron.
 31. A retroviral packaging cell line according to claim 21, wherein the promoter is a hCMV promoter.
 32. A retroviral packaging cell line according to claim 21, wherein the viral env gene and the selectable marker is a CMV10A1 (SEQ ID No. 9) expression construct.
 33. A process for producing a retroviral packaging cell line in which at least one gene of interest is expressed, which process comprises: transforming host cells with a first and a second recombinant expression vector, said first recombinant expression vector having a packaging-deficient construct comprising a viral gag-pol gene, said gag-pol gene optionally being operably linked to a gene of interest and a first selectable marker gene downstream thereof, and said second recombinant expression vector having a packaging-deficient construct comprising a viral env gene, said env gene optionally being operably linked to a gene of interest and a second selectable marker gene downstream thereof; wherein the start codon of the first and second selectable markers are spaced from the stop codons of the viral gag-pol gene and the viral env gene respectively by a distance which ensures that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation; and selecting transformed cells which express at least one and optionally both first and second marker genes, thereby producing a retroviral packaging cell line expressing said at least one gene of interest.
 34. A packaging deficient construct for use in a process according to claim 33, which expresses a viral gag-pol gene and a selectable marker wherein a start codon of the selectable marker is spaced from a stop codon of the viral gag-pol gene by a distance which ensures that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation.
 35. A packaging deficient construct for use in a process according to claim 33, which expresses a viral env gene and a selectable marker gene; wherein a start codon of the selectable marker is spaced from a stop codon of the viral env gene by a distance which ensures that said selectable marker protein is expressed from the corresponding mRNA as a result of translation reinitiation. 